THE USE OF APTITUDE TESTS IN THE SELECTION
OF  RADIO  TUBE  MOUNTERS 

I.  Introduction  and  Statement  of  the  Problem 


In  April,  1942,  a  Vestibule  Training 
School  was  organized  at  the  Harrison, 
New  Jersey,  Plant  of  the  Radio  Corpora- 
tion of  America,  in  order  to  train  selected 
new  employees  quickly  in  the  basic  skills 
involved  in  the  assembly  of  radio  tube 
mounts  and  to  place  graduates  in  the  fac- 
tory in  accordance  with  the  level  of  abil- 
ity demonstrated  during  the  training  pe- 
riod. 

The  instructors  soon  discovered  that 
individuals  differed  markedly  in  ability 
to  perform  the  operations  used  for  train- 
ing purposes  in  the  school.  Specifically,  it 
appeared  that  the  existing  employment 
procedures  did  not  provide  for  a  suffi- 
ciently precise  evaluation  of  the  manipu- 
lative skills  required  for  successful  per- 
formance.1 


1  The  reader  who  is  familiar  with  even  a  small 
part  of  the  literature  and  research  concerning 
the  nature  and  magnitude  of  individual  differ- 
ences and  the  difficulties  frequently  involved  in 
their  detection  and  evaluation,  will  not  interpret 
these  observations  as  a  reflection  upon  the  com- 
petence of  those  charged  with  responsibility  for 
selection  and  placement. 

Griffitts  (7),  for  example,  studied  the  relation- 
ship between  twelve  anatomical  measurements, 
including  height,  weight,  hand  measurements 
and  various  ratios  thereof,  and  performance  on  a 
total  of  twelve  manipulative  tasks.  Despite  the 
fact that the measures of manipulative ability
had  an  average  reliability  of  .85,  all  correlations 
between performance and the anthropometric
measurements  were  low.  Only  two  of  the  144 
coefficients  "may  .  .  .  have  statistical  significance." 
Tiffin  (18),  commenting  upon  this  and  other 
studies,  says  in  part: 

Some  employment  managers  judge  the  dex- 
terity of  an  applicant  by  examining  his  hands 
and  fingers.  .  .  .  Perhaps  in  extreme  cases, 
where an applicant has fingers that are stiff
or very stubby, one could predict from an examination
of his hands that he would probably
be  low  in  finger  dexterity;  but  in  the  great 
majority  of  cases  such  a  judgment  would  be 
no  more  than  a  guess.  What  an  applicant  can 
do  with  his  hands,  not  the  appearance  of  the 
hands,  determines  his  qualifications  for  a  man- 
ual dexterity  job. 
Moreover, there is ample evidence to show not only that the
correlations between various measures of dexterity and intelligence
are generally low (3), (12), (13), (20), but also that dexterity is
more or less unrelated to other abilities. Thus Harrell (9) reports
that "manual agility" is a factor separate from "perception of
detail," "verbal relations," "visualizing spatial relations" and
"youth." This conclusion is based upon an analysis of the scores on
thirty-four variables including various manual, spatial and verbal
tests and such personal data as age, school grade completed,
experience on mechanical jobs and supervisory ratings. Ninety-one
cotton mill machine-fixers served as subjects.

It appears, therefore, that interviewers should not be expected to
evaluate the manipulative abilities of applicants except, perhaps,
when a reliable report on past performance in similar or identical
work is available to demonstrate possession of an adequate amount of
the required skills.


These and other facts were taken to indicate the desirability of
determining experimentally the value of certain manipulative aptitude
tests in the selection of trainees for the Vestibule Training School.

After two preliminary experiments in the school and one with
experienced mounters in the factory, the follow-up method was adopted
and used almost exclusively. The program called for testing
prospective mounters at the time of hiring, and for later correlation
of the scores with criteria of performance in the school and in the
factory.

It was the original intention of this experimenter to present all the
data procured in the course of the research with mounters. The
approach to the problem, however, was one of continuous and, somewhat
later, intermittent research. A number of experimental tests were
administered even after the initial battery found application in the
employment office. Furthermore, many of the decisions affecting the
research with mounters were determined by the possibilities and
practical demands of the situation in which the work was done, as
well as by the broader testing program which was being developed
simultaneously. It is necessary, therefore, to select for
presentation only a portion of the data pertinent to the use of
aptitude tests in the selection of mounters.

We will concern ourselves with the study of [number illegible] female
trainees who were tested and hired between May 17th and September
15th, 1943, and who satisfy certain other criteria itemized in
Chapter IV of this report.

Utilizing the Wherry-Doolittle method, two regression equations will
be developed to predict performance in the Vestibule Training School.
The predictions made by these equations will be further validated
against subsequent performance on mounting jobs in the factory. While
these data do not constitute the total amount of validation on these
tests, the data presented and the conclusions reached represent the
more important practical results which emerged from the research up
to that time.

A review of the literature indicates that to date only one study
dealing with the selection of mounters has been published. Forlano
and Kirkpatrick (2), working with twenty mounters from another plant
of the Radio Corporation of America, found "A composite of
intelligence and personality scores ... to be effective in predicting
the subsequent success of new tube mounters."

The present report will be confined to an evaluation of five standard
manipulation tests for this situation.


II.   Job  Analysis 


The assembly of mounts for the receiving
and allied type tubes which are
manufactured  at  the  Harrison  Plant,  in- 
volves the  careful  manipulation  and  posi- 
tioning of  small  and  often  delicate  parts 
with   the   fingers   and   tweezers.  A  very 
high    percentage    of    these    jobs    require 
the  use  of  a  bench  type,  resistance  welder. 
When    the    Vestibule    Training    School 
was  organized,  therefore,  three  relatively 
simple  assemblies  were  selected  for  fabri- 
cation by  the  process  of  resistance  weld- 
ing, on  the  grounds  that  they  embodied 
skills  which  are  basic  to  most  mounting 
operations.  In  order  to  provide  a  situa- 
tion  favorable  to  the  rapid  acquisition 
of  these  skills  and,  at  the  same  time,  to 
measure    the    relative    ability    of    each 
trainee  to  acquire  them,  the  procedures 
followed   resembled   those   employed    in 
the  administration  of  aptitude  tests  of  the 
manipulative  type.  After  a  period  of  in- 
struction and  supervised  practice   with  a 
particular    assembly,    the    trainees    were 
given  a  series  of  production  tests  of  one 
hour  duration.  At  the  conclusion  of  each 
test,  the  number  of  units  produced  was 
recorded  and  the  work  inspected.  While 
there  were  a  number  of  factors  (pp.  nf) 
in  the  situation  which  precluded  the  use 
of these data as criteria, it will be shown
that they provided the instructors with a
reasonably sound basis for evaluating
trainee performance.

In order that the operational significance of the experimental
results may be understood by the reader who is not familiar with the
process of assembling small metal parts by resistance welding, the
process, the equipment and the operations performed will be described
in some detail. Results of the training procedure will then be
examined for their use in selection.


A. THE OPERATIONS PERFORMED

1. Resistance Welding: Equipment and Process.

The joining of two metal parts by resistance welding is accomplished
by the application of a potential difference in such a way that an
electric current passes from one part to the other at the point
(small area) where the weld is desired. The heat developed is
proportional to the resistance of the path, the square of the current
and its action time, which depend in turn upon other factors, such as
the mechanical pressure exerted on the parts and the properties of
the materials themselves. Under proper conditions, the current
encounters maximum resistance in passing from one part to the other.
If sufficient, but not excessive, heat is developed at this point,
the parts will be welded. Resistance welding is complicated by the
difficulties involved in localizing and controlling the fusion of the
two metals in the desired area and in avoiding burning (excessive
fusion and oxidation) or welding of the parts to the electrodes. Most
of these problems are the concern of engineers, supervisors and setup
men and will not be discussed. Of immediate interest are the
operations performed and the process factors controlled by the
trainee.
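
The proportionality stated above can be put in numerical form. The following is a minimal sketch in Python, assuming the familiar Joule relation (heat equals the square of the current, times the resistance, times the action time) with current and resistance held constant during the weld. The function name and the sample values are hypothetical and are not taken from the Harrison Plant equipment.

```python
def weld_heat_joules(current_amps, resistance_ohms, action_time_s):
    """Heat developed at the weld, assuming H = I^2 * R * t."""
    if current_amps < 0 or resistance_ohms < 0 or action_time_s < 0:
        raise ValueError("current, resistance and time must be non-negative")
    return current_amps ** 2 * resistance_ohms * action_time_s


# Illustrative values only (assumed, not measured).
base_heat = weld_heat_joules(1000.0, 0.002, 0.05)
doubled_current_heat = weld_heat_joules(2000.0, 0.002, 0.05)
ratio = doubled_current_heat / base_heat

print(f"heat at base current:    {base_heat:.1f} J")
print(f"heat at doubled current: {doubled_current_heat:.1f} J")
print(f"ratio:                   {ratio:.1f}")
```

Because the current enters as a square, a moderate change in current intensity has a disproportionate effect on the heat developed, which may help to explain why a weld made at the wrong intensity is so readily "cold" or "burned."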

2 The author wishes to thank Mr. J. R. Gates, welding engineer, who
reviewed this section of the manuscript for technical accuracy. Over
a period of several years, numerous supervisors and machine
attendants have given freely of their experience with welding
problems. In this connection, thanks are especially due to Mr. I. J.
Pilas.

Figure 1 shows a bench type resistance welder, mounted on table (V).
Two crossed lengths of wire (X) and (Y), are positioned for welding
between the stationary electrode (A) and the movable electrode (B).
If the drawing were complete, it would show an operator seated in an
adjustable posture chair, holding wire (X) with a pair of tweezers
grasped in the right hand, and wire (Y) in the fingers of the left
hand. The operator's feet would be located on the left (L) and right
(R) pedals, respectively. The pedal (R) controls the movement of the
electrode (B) by means of a series of mechanical linkages which are
represented
in  highly  conventionalized  form  in  the 
sketch.  The  chain  (F),  which  is  attached 
to  the  pedal  (R),  passes  over  pulley  (H) 
to  the  lever  arm  (K).  Although  (K)  is 
mounted  independently  of  the  swinging 


arm  (D)  on  the  shaft  (M),  the  two  parts 
are connected by the spring (E). The
drawing  shows  the  right  pedal  depressed 
just  enough  to  bring  the  movable  elec- 
trode into  contact  with  the  work.  With- 
out pausing,  the  operator  continues  the 
downward  stroke  of  the  right  pedal  be- 
yond this  point  until  the  movement  of 
(K)  is  arrested  by  the  stop  (G).  Since  the 
two  lengths  of  metal  wire  are  relatively 
incompressible  when  cold,  depression  of 
the  pedal  results  in  the  extension  of 
spring  (E).  In  terms  of  the  conventional- 
ized drawing,  the  exact  pressure  exerted 
on  the  parts  at  the  moment  of  welding 
can be controlled by selecting and adjusting the spring (E) and by
setting the switch (not shown), which initiates the current flow
between the electrodes, so that it closes when the proper spring
tension is reached. The action time of the current is determined with
a high degree of precision by the electronic control panel (P) and
does not depend upon the speed with which the pedal (R) is depressed
or upon the quick withdrawal of the foot.

Fig. 1. Conventional Drawing of Bench Type Resistance Welder.

Thus  depression  of  pedal  (R)  serves  to 
bring  the  movable  electrode  (B)  into  con- 
tact with  the  parts  to  be  welded  and,  in 
addition,  closes  the  circuit  which  pro- 
vides the  current  needed  to  complete  the 
weld. 
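
The control of welding pressure by the spring (E) and the switch setting, described above, may be illustrated by a short sketch. It assumes, for illustration only, a linear spring whose tension is proportional to its extension, and it ignores the lever ratios of the actual linkage. The spring rates, the target tension and the function names are hypothetical.

```python
def spring_tension(spring_rate_n_per_mm, extension_mm):
    """Tension in an idealized linear spring (assumed Hooke's law)."""
    if spring_rate_n_per_mm <= 0 or extension_mm < 0:
        raise ValueError("spring rate must be positive, extension non-negative")
    return spring_rate_n_per_mm * extension_mm


def extension_at_switch_closure(spring_rate_n_per_mm, target_tension_n):
    """Extension at which the switch should be set to close."""
    if spring_rate_n_per_mm <= 0 or target_tension_n < 0:
        raise ValueError("spring rate must be positive, tension non-negative")
    return target_tension_n / spring_rate_n_per_mm


# Two hypothetical springs set to deliver the same tension at the weld.
target_tension = 10.0  # newtons, assumed
soft_setting = extension_at_switch_closure(2.0, target_tension)
stiff_setting = extension_at_switch_closure(5.0, target_tension)

print(f"soft spring closes switch at  {soft_setting:.1f} mm")
print(f"stiff spring closes switch at {stiff_setting:.1f} mm")
```

Under these assumptions the tension at the moment the current flows is fixed by the choice of spring and the switch setting alone, which agrees with the statement that the result does not depend upon how quickly the pedal is depressed.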

The  left  pedal  has  only  one  function. 
It  is  used  for  welds  requiring  current  of 
different  (usually  greater)  intensity.  Some 
welds  can  be  made  with  the  right  pedal 
alone,  while  others  require  that  the  left 
pedal  be  depressed  before  the  right  pedal 
is  used  to  bring  down  the  movable  elec- 
trode and  complete  the  weld.  Incorrect 
use  of  the  pedals  may  result  in  a  cold  or 
burned  weld,  which  can  be  detected  usu- 
ally by  visual  inspection.  Since  mount- 
ing jobs  generally  require  a  combination 
of  welds  at  the  two  intensity  levels,  oper- 
ators must  follow  rigorously  the  pre- 
scribed patterns. 

2.  Process  Factors  Controlled  by  the 
Operator. 

The  operator  is  responsible  for  produc- 
ing good  welds  at  a  rate  determined  by 
time study techniques for each specific
job,  and  by  strict  adherence  to  instruc- 
tions pertaining  to  the  use  of  materials 
and  equipment.  In  part,  this  responsibil- 
ity involves  the  following: 

a. Summoning the machine attendant or supervisor when welds are not
coming through properly. An experienced operator must be able to
distinguish a good or satisfactory weld from a poor one.

b. Following directions pertaining to the correct use of foot pedals,
as explained above.

c. Avoiding downward pressure on the stationary electrode (A). Since
the operator holds the parts with the fingers or tweezers at a point
somewhat removed from the area of welding, any downward pressure on
either of the parts to be welded may result in "bowing" the parts,
especially if the pressure is continued while the parts are being
fused and joined.

d. The careful handling of parts. A majority of radio tube parts must
be handled lightly and delicately to avoid physical injury. Parts
must frequently be grasped at a specified point.

e. Avoiding chemical contamination of parts. Certain parts, and
certain areas of other parts may not be touched with the fingers.

f. The proper positioning of parts between the two electrodes.

It  will  be  noted  that  the  last  four  re- 
sponsibilities serve  to  define  the  degree 
of  control  which  the  operator  must  exer- 
cise over  the  motions  of  her  hands  and 
fingers.  Since  the  abilities  involved  in  the 
rapid,  repetitive  positioning  of  parts  be- 
tween the  two  electrodes  play  a  promi- 
nent part  in  determining  the  competence 
of  the  operator,  further  description  is 
appropriate  at  this  point. 

There are several factors which serve to set the limits of precision
within which the operator must position the parts to be welded. Most
obvious, perhaps, are the tolerances designated in the engineering
specifications. These describe the allowable deviation from the
specified location of the parts relative to each other. Tolerances of
one-half and one millimeter are common in the radio tube industry.
Operators do not work directly from engineering specifications.
Instead, throughout the training period, the supervisor endeavors to
narrow down the range of the trainee's variations until they fall
within acceptable limits.
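
The narrowing of a trainee's variations toward a tolerance can be sketched as follows. This is an illustration only: it assumes that a tolerance of one-half millimeter means a deviation of no more than 0.5 mm in either direction, and the deviation figures and function name are invented for the example, not drawn from the school's records.

```python
def fraction_within_tolerance(deviations_mm, tolerance_mm):
    """Share of placements whose absolute deviation is within tolerance."""
    if not deviations_mm:
        raise ValueError("at least one deviation is required")
    if tolerance_mm < 0:
        raise ValueError("tolerance must be non-negative")
    acceptable = sum(1 for d in deviations_mm if abs(d) <= tolerance_mm)
    return acceptable / len(deviations_mm)


# Hypothetical deviations (mm) from the specified location of a part.
early_practice = [1.4, -0.9, 0.7, -1.2, 0.3]
late_practice = [0.4, -0.3, 0.2, -0.5, 0.1]

early_share = fraction_within_tolerance(early_practice, 0.5)
late_share = fraction_within_tolerance(late_practice, 0.5)

print(f"early practice within tolerance: {early_share:.0%}")
print(f"late practice within tolerance:  {late_share:.0%}")
```

A tally of this kind expresses in one figure what the supervisor judges by inspection, namely whether the range of the trainee's variations has come within acceptable limits.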




LOUIS VINCENT SURGENT


The available flat surfaces at the end of the electrodes which are in
contact with the parts to be welded (Figure 1) limit the operator's
freedom in positioning the parts, in a number of ways. Since the two
available flat areas are not generally of the same size, the operator
must position the parts so that the smaller surface will be located
properly on the work at the moment of welding. In the school the
movable electrode consisted of a cylindrical welding rod, filed to a
taper at one end, leaving an available flat contact surface
approximately 0.07" in diameter. The parts had to be positioned on
the bottom electrode in such a way that this surface at the end of
the movable electrode would bear properly on the parts when the right
pedal was depressed.

There  are  two  additional  factors  which 
complicate  the  proper  positioning  of  the 
parts  between  the  electrodes.  Since  the 
movable  electrode  (Figure  1)  is  held  by 
the  swinging  arm  (D)  which  rotates 
about  the  shaft  (M),  the  path  of  the  flat 
contact  surface  at  the  end  of  the  electrode 
is  not  a  straight  line.  However,  when 
the  movable  electrode  begins  to  exert 
pressure  on  the  parts  to  be  welded,  its 
contact  surface  must  be  parallel  to  the 
contact  surface  of  the  stationary  elec- 
trode, to  avoid  indentation  or  burning  of 
the  parts.  Moreover,  from  the  moment 
the  movable  electrode  touches  the  work, 
the  arc-segment  which  describes  its  path 
toward  the  stationary  electrode  must  be 
approximately  a  straight  line,  perpendi- 
cular to  the  stationary  contact  surface.  If 
this  were  not  the  case,  there  would  be 
slippage  between  the  two  contact  sur- 
faces, causing  the  parts  to  be  welded  to 
move  relative  to  each  other.  If  this  oc- 
curred and  were  resisted  by  the  operator, 
deformation  of  the  work  would  result. 
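
The requirement that the arc-segment be approximately a straight line can be given a rough numerical meaning. The sketch below assumes simple circular motion of the contact surface about the shaft (M), and assumes that the stroke ends where the tangent to the arc is perpendicular to the stationary contact surface. The arm lengths, the stroke and the function name are hypothetical.

```python
import math


def lateral_slip(arm_radius, travel):
    """Sideways drift of a point on a swinging arm over a short stroke.

    `travel` is the distance moved perpendicular to the stationary
    surface. Both arguments are in the same unit, as is the result.
    """
    if arm_radius <= 0 or not 0 <= travel <= arm_radius:
        raise ValueError("need arm_radius > 0 and 0 <= travel <= arm_radius")
    return arm_radius - math.sqrt(arm_radius ** 2 - travel ** 2)


# Hypothetical arms, each compressing the work through a 1 mm stroke.
long_arm_slip = lateral_slip(150.0, 1.0)
short_arm_slip = lateral_slip(15.0, 1.0)

print(f"150 mm arm: {long_arm_slip:.4f} mm of slip")
print(f" 15 mm arm: {short_arm_slip:.4f} mm of slip")
```

On these assumptions the sideways drift is very small when the arm is long relative to the stroke, so that the path may be treated as a straight line; a short arm or a long stroke would produce appreciable slippage between the contact surfaces.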

Furthermore, although the contact surfaces must remain parallel to
each other while in contact with the work, it is not necessary that
they lie in horizontal planes, as shown in Figure 1. Frequently, the
geometry of the parts to be welded, the need for providing operators
with adequate visual cues and the requirements governing the movement
of the two contact surfaces relative to each other while bearing on
the work, dictate filing the parallel contact surfaces in a plane
sloping toward the operator. Both horizontal and sloped contact
surfaces were used in the school.

Translating  these  conditions  into  geo- 
metric  requirements  which  must  be  met 
by  the  operator  in  positioning  the  parts 
between  the  two  contact  surfaces,  we  have 
the  following: 

a. The two parts to be welded must lie in planes parallel to the
planes of the two contact surfaces.

b. The two parts must be placed relative to each other in such a way
that the compressive forces exerted by the two parallel contact
surfaces do not cause movement of the parts relative to each other.

These  general  statements  will  be  clari- 
fied by  an  examination  of  the  three  types 
of  welds  made  in  the  Vestibule  Training 
School  and  common  in  tube  assembly 
operations. 

3.   Three  Types  of  Welds. 

Figure  2  shows  the  three  types  of  welds 
made  in  the  school.  For  convenient  refer- 
ence, these  are  designated  by  the  letters 
A,  B,  and  C.  In  the  first  sketch  of  each 
series,  the  parts  (X)  and  (Y)  are  shown 
welded  together,  with  a  small  "x"  mark- 
ing the  location  of  each  weld.  At  the 
right,  the  parts  are  drawn  properly  po- 
sitioned between  the  stationary  electrode 
(A)  and  the  movable  electrode  (B). 

There  are  two  geometric  requirements 
which  govern  the  positioning  of  the  two 
small metal cylinders shown in Type A, Figure 2. First, the
cylinders, (X) and (Y), must lie in planes parallel to the planes of
the electrode contact surfaces. In addition, the parts must be
positioned in such a way that a straight line could be drawn
perpendicular to the two contact surfaces at or near their respective
centers, which would intersect the two lines of center of the
cylinders. If both requirements are met, the imaginary line will also
intersect four surface elements of the cylinders; namely, the two
elements at the lines of contact between (X) and (B), and (Y) and
(A), respectively, and the two elements which cross at right angles
at the point of contact between (X) and (Y). If the cylinders and
contact surfaces were geometrically perfect and incompressible,
infinite accuracy would be required in order to avoid movements of
the parts with compression. In practice, however, friction between
the surfaces and the compressibility of the metal parts permit some
deviations from the geometrically deduced requirements. Moreover, the
pressure of the parallel contact surfaces will correct for minor
deviations from proper alignment without damage to the parts, if the
operator is sufficiently skilful to avoid the exertion of
counteracting forces. Despite these compensating factors, it is
obvious that this type of weld requires a high degree of precision in
positioning the parts.

Fig. 2. Three Types of Welds Made in the School.
TYPE A. TWO CYLINDRICAL WIRES CROSSED AT RIGHT ANGLES.
TYPE B. A FLAT METAL PIECE TO A CYLINDRICAL WIRE.
TYPE C. A SMALL CYLINDRICAL WIRE INSIDE A HOLLOW METAL CYLINDER.

In  the  Type  B  weld,  both  the  cylindri- 
cal wire  and  the  flat  metal  piece  must 
lie  in  planes  parallel  to  the  electrode 
contact  surfaces.  If  this  condition  is  met, 
and  if  the  small  area  in  which  the  parts 
are  to  be  welded  is  reasonably  well  cen- 
tered with  respect  to  the  smaller  of  the 
two  contact  surfaces,  the  parts  will  not 
move  relative  to  each  other  when  com- 
pressed. 

When  a  small  cylinder  is  welded  in- 
side a  larger,  hollow  metal  cylinder 
(Type  C,  Figure  2),  it  is  the  line  of  con- 
tact formed  by  the  two  contiguous  ele- 
ments of  the  cylinders  which  must  lie  in 
a  plane  parallel  to  the  two  electrode  sur- 
faces. In  addition,  an  imaginary  plane 
passed  through  the  lines  of  centers  of  the 
cylinders,  must  be  perpendicular  to  the 
electrode  contact  surfaces,  and  intersect 
the  smaller  one  at  or  near  the  center. 


The  sketch  for  this  type  of  weld  shows 
that  two  welds  are  required  to  join  the 
parts  securely.  The  two  methods  of  posi- 
tioning the  parts  are  drawn  as  they  would 
appear  to  an  observer  seated  in  the  opera- 
tor's chair.  In  making  the  weld  nearest 
the  top,  the  large  cylinder  is  placed  on 
the   stationary  electrode   and   the   small 
cylinder    is    inserted    with    the    tweezers 
through  the  opening  in  the  bottom  of 
the  large  cylinder.  To  enable  the  mova- 
ble electrode  to  clear  the  top  of  the  large 
cylinder  (cut  away  in  the  sketch)  and  to 
provide  the  operator  with  the  visual  cues 
required   for   positioning,   the   electrode 
contact    surfaces    must    lie    in    parallel 
planes   sloped   toward   the   operator   (p. 
4).   The  second   weld,   however,   is   too 
far  below  the  top  of  the  cylinder  to  be 
made  in  this  position.  Instead,  the  parts 
are  removed   from   the  electrodes  upon 
completion  of  the  first  weld.  The  cylin- 
ders are  then  rotated  180  degrees  about 
the  center  line  of  the  large  cylinder  and 
slipped  over  the  bottom  electrode  so  that 
the  small  cylinder  (X)  rests  on  the  sta- 
tionary   contact    surface,    with    cylinder 
(Y)   above.  The  third  drawing  for  the 
Type  C  weld,  shows  the  parts  in  this  posi- 
tion. 

While  the  geometric  requirements  are 
the  same  for  both  welds  (Type  C),  the 
perceptual  cues  are  somewhat  different. 
In the first weld the alignment of the parts is perceived visually,
whereas with the second weld the small cylinder must be positioned on
the stationary electrode partly by "feel" and partly by the visual
cues provided by the mark left on the outside of the cylinder by the
first weld, and by the overall shape of the large cylinder.

When one adds to those geometric limitations, the careful handling
required by the delicacy of the parts and the need for rigorous
adherence to engineering specifications and tolerances, it is obvious
that the positioning of the parts for welding demands a high degree
of control over the movements of the hands, fingers and tweezers.

4. The Assemblies Produced.

All  operators  began  their  training  with 
a  period  of  instruction  and  supervised 
practice  with  scrap  wire.  This  enabled 
them  to  acquire  certain  basic  techniques 
and to learn something of the fundamentals
of resistance welding without the fear
which might have attended the
initial use of more delicate good parts. The
operation  consisted  in  joining  two 
lengths  of  40  mil  (.040"  diameter),  hard- 
ened nickel  wire,  crossed  at  right  angles 
to  their  respective  midpoints.  Since  the 
wires  were  collected  as  scrap  metal  from 
a  trimming  operation  in  the  factory,  they 
varied  in  length  from  0.75"  to  1.50".  Two 
pedals  were  used  in  making  the  welds 
and  the  requirements  for  positioning  the 
parts  between  the  electrodes  were  those 
stipulated  in  the  discussion  of  the  Type 
A  weld  (Figure  2). 

During  the  remainder  of  the  training 
period,  three  types  of  assemblies  were 
produced  for  use  in  completed  vacuum 
tubes.  These  will  not  be  described  in  de- 
tail, except  to  say  that  all  three  types  of 
welds  were  represented.  Two  of  the  as- 
semblies involved  the  use  of  both  pedals, 
whereas  the  third  required  a  combina- 
tion of  one  and  two  pedal  welds.  Both 
horizontal  and  sloped  electrode  contact 
surfaces were employed. Since the parts
produced  were  destined  for  use  in  com- 
pleted  vacuum  tubes,  adherence  to  all 
engineering  specifications  was  impera- 
tive. Tolerances  of  one-half  and  one  mil- 
limeter were  common.  For  reference  pur- 
poses, the  three  assemblies  will  be  desig- 
nated by  the  letters,  L,  M,  and  N. 

After a period of instruction and supervised practice with the first
assembly, the trainee was given a series of production tests of one
hour duration. At the conclusion of each test, the number of units
produced was recorded and the work inspected. When a satisfactory
level of skill was demonstrated, the next operation was similarly
administered.

Inspection of the completed assemblies provided the instructors with
several bases for judging the trainee's aptitude for the work: (a)
The trainee's capacity for rapid, precise positioning of parts with
respect to both the electrode contact surfaces and to each other was
judged by (i) the amount of production during the one hour production
tests, (ii) burning, indentation or other deformation of the parts,
and (iii) the extent of deviations of the parts from their specified
location with respect to each other. (b) The proper use of pedals was
indicated by the absence of "burned" or "cold" welds. Observation of
the operator during instruction and while at work provided additional
information. In this way, the instructors endeavored to attain two
major objectives, namely, the training of inexperienced mounters and
the measurement of their aptitude for the job.

B.  THE  APTITUDES  REQUIRED 

The isolation and measurement of specific human abilities have not
reached the point of development where the aptitudes involved in
these jobs can be inventoried and catalogued unambiguously. If the
situation were otherwise, there would be little need for the
experimental validation of the tests. Therefore, even though job
analyses are frequently couched in apparently specific terms such as
"finger dexterity," "foot-eye-hand coordination," and the like, those
terms and statements involving them must be regarded for the most
part as descriptive of what the operator does. Highly generalized,
phenotypical descriptions have many legitimate uses in the employment
office and elsewhere. In order to be operationally meaningful in
designating specific aptitudes, however, it would be necessary that
each of the terms refer to a specific set of operations (i.e., to a
particular method, technique or test) by which the designated
aptitudes could be identified and measured, and that the same set of
operations could be used to select applicants for any job in which a
high degree of the characteristic were found. For example, it would
be necessary either to demonstrate that "finger dexterity" refers to
the same aptitude in watchmaking, mount assembly, typing or playing
the piano, or to develop subclasses of the concept which could be
applied with adequate specificity.

These  requirements  are  only  partially 
fulfilled.  For,  while  such  terms  do  not 
generally  refer  to  a  specific  set  of  opera- 
tions, they  do,  in  some  cases,  point  to  a 
broad  class  of  operations  which  have 
been  found  by  factor  analysis  and  cor- 
relation studies  to  be  relatively  distinct. 
To  the  extent  that  this  is  true,  they  serve 
to  delimit  the  search  for  specific  tests  and 
techniques  which  are  likely  to  correlate 
satisfactorily  with  job  performance. 

With  these  limitations  in  mind,  the 
job  was  analyzed  in  accordance  with  the 
schedule  developed  by  the  War  Man- 
power Commission  (24),  which  rates  each 
of  47  "worker  characteristics"  on  a  four 
point  scale.  Characteristics  receiving 
either  of  the  two  highest  ratings,  A  or  B, 
are  listed  below.3 

Working rapidly for long periods (B)

Dexterity of the fingers (B)

Dexterity of hands and arms (B)

Eye-hand coordination (A)

Foot-eye-hand coordination (B)

Coordination of independent movements of both hands (A)

Estimate size of objects (B)

Perceive form of objects (B)

Keenness of vision (A)

Muscular discrimination (B)

Estimating quality of objects (B)

3 The author is indebted to Mr. L. A. Kameen, experienced job
analyst, for checking the analysis for technical accuracy.

The present experiment is concerned primarily with the
validation of tests related to dexterity of the fingers, hands
and arms, eye-hand coordination and the coordination of
independent movements of both hands.

C.   PRODUCTION  DATA  EXAMINED  AS 
POSSIBLE  CRITERIA 

The  possibility  of  using  the  production 
test  scores  (p.  9)  as  an  independent  meas- 
ure of  the  ability  of  each  trainee  to  per- 
form these  delicate  operations  was  fully 
explored. 

The reliability coefficients for scores, based upon one, two,
three and four successive production tests on each assembly are
shown in Table 1. The figures marked with an asterisk represent
the calculated product moment correlations. For example, the
index for two trials on assembly L (.893) was determined by
correlating the total number of units produced during the first
two production tests with a similar score for the third and
fourth tests. The asterisked coefficient for assembly M (.825)
was computed in the same way. Intercorrelation of the number of
units fabricated during the first and second tests on assembly N
yielded a coefficient of .727, the reliability of the scores on
one production test. The remaining indices were computed with
the Spearman-Brown prophecy formula. These data were obtained by
selecting at random approximately two hundred records from the
many hundreds who were trained in the school during the first ten
months of the year, 1943. This was done in order that the results
would be applicable to all the experimental data collected during
that period. Since not all trainees had the required number of
production tests on each operation, the sampling process was
repeated for each coefficient.

Table 1

Reliability of Scores Based Upon One, Two, Three and Four
Successive Production Tests on Each Assembly

                        Number of Production Tests
Operations      N       1       2       3       4

Assembly-L     203    .807    .893*   .926    .943
Assembly-M     200    .702    .825*   .876    .904
Assembly-N     200    .727*   .842    .888    .914

* Calculated product moment correlation.
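
The stepped-up and stepped-down values in Table 1 can be checked against the Spearman-Brown prophecy formula. The following is a minimal sketch and not the author's own computation: the function names are illustrative, and it assumes the general form r_k = k r / (1 + (k - 1) r), where r is the reliability of a single production test.

```python
def spearman_brown(r_one: float, k: float) -> float:
    """Reliability of a score based on k tests, given single-test reliability."""
    return k * r_one / (1 + (k - 1) * r_one)


def single_test_reliability(r_k: float, k: float) -> float:
    """Invert the formula: single-test reliability implied by a k-test value."""
    return r_k / (k - (k - 1) * r_k)


# Calculated coefficients reported in the text: (number of tests, r).
calculated = {"L": (2, 0.893), "M": (2, 0.825), "N": (1, 0.727)}


def table_row(assembly: str) -> list[float]:
    """Estimated reliabilities for scores based on 1, 2, 3 and 4 tests."""
    k, r_k = calculated[assembly]
    r_one = single_test_reliability(r_k, k)
    return [round(spearman_brown(r_one, n), 3) for n in (1, 2, 3, 4)]


for name in calculated:
    print(name, table_row(name))
```

Under these assumptions the computed rows agree with the tabled values to within about .001, the residual differences being attributable to rounding.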

The results indicate that the production tests have attained a
degree of reliability comparable to that of many employment
tests. One or two trials on each assembly would provide a score
sufficiently reliable for mass testing purposes. There were,
however, several factors which limited the value of production
data as criteria.

1. Unequal amounts of practice before beginning the first
production test on a job. Although a given group of trainees,
entering the school on the same day, generally followed the same
schedule, there were variations among the different groups
processed over a period. Moreover, trainees whose performance
failed to meet certain minimum requirements were frequently given
additional practice before taking the first test.

2. Variations in the number of production tests given on each
job. Production tests were continued with each assembly until the
trainee achieved a satisfactory level of production and quality.
Thus the better operators generally had less experience in the
first welding operation before undertaking the second and third
assemblies. The number of production tests administered also
depended upon the relative number of each type of assembly
required in the factory over a period of time as well as upon the
availability of materials.

3. It was not always possible to adhere to the planned sequence
of jobs. For example, it was sometimes necessary to begin a
trainee on assembly M without the benefit of experience with
assembly L.

4. Variations in quality and workmanship. Since speed of
production depends to at least some extent upon the amount of
attention given to quality and workmanship, raw production test
scores are not, in themselves, precise measures of differential
ability.

5. Although every effort was made to avoid interruptions during
production tests, it was frequently necessary to correct errors
in performance as they appeared, in order to prevent their
fixation. Also, occasional machine and material difficulties are
inevitable over an experimental period of several months.

To the extent that the first three factors operated, trainees
were tested at different points on the learning curve. While
these variations may be expected to have little effect upon the
reliability with which the trainee's skill at the time of testing
is measured, they would tend to reduce the intercorrelations
among the scores for the three assemblies, as well as their
suitability as criteria.

Variations in quality, on the other hand, would reduce the
reliability coefficient for any test if they occurred from trial
to trial for the same individual. Differences in quality from job
to job for the same individual would reduce inter-assembly
correlations. Any variance in quality among individuals would
reduce the validity of the scores as criteria.

The occurrence of interruptions would tend to reduce all
correlations from their true values.

While one might wish that these sources of variation had been
controlled, the fact that the techniques and procedures
themselves have been demonstrated to be reliable measures of the
abilities required to perform each of these jobs, is a strong
argument for the validity of the instructors' judgments based
upon these scores. If the techniques had been found to be
unreliable, the validity of their judgments would be open to
serious question. Given reliable techniques, however, it was
possible for the instructors to correct for variations which
attended the testing of individual trainees. To have accomplished
this with a high degree of precision, it would have been
necessary to develop a set of norms for each particular set of
circumstances. Since multiple norms were not available, it would
be naive to pretend that the corrections were applied without
error. However, the instructors had had opportunity to observe
and judge many hundreds of trainees under similar conditions,
before the present experiment was begun. It is reasonable to
assume, therefore, that their pooled judgment, based upon several
days of constant observation of trainee performance, would
constitute a better criterion than the uncorrected production
test scores themselves.

The  criterion  finally  selected  will  be 
described  in  the  following  chapter. 


III.  The  Correlated  Variables 

The validation of aptitude tests consists essentially in
determining the degree of relationship between two sets of
variables, namely, the criterion of job performance and the
several aptitude test scores.

A. THE CRITERION

The criterion was a single overall rating, based upon the pooled
judgment of the supervisor of the school and at least two
instructors, expressed on a five-point scale. The procedures in
the school were such that each instructor had the opportunity to
teach every trainee at least one of the assemblies and to
administer the required production tests. The supervisor of the
school followed closely the progress of each trainee by repeated
observations of performance, by assisting the instructors in
teaching the operations and by checking the reported scores on
the production tests as well as miscellaneous notations on the
record sheet. In addition, an effort was made to size up the
trainee's probable adjustment to and attitude toward the work, by
means of many informal contacts and one planned interview.

Upon completion of the training, the supervisor, together with
the instructors, reviewed the trainee's performance and general
fitness for the occupation. The overall rating was arrived at
through discussion and represented the consensus of the group.

1. The Meaning of The Overall Rating Continuum.

Since the factors which make up this composite rating determine
entirely the usefulness of the correlations obtained with the
tests, the meaning of the continuum will be explored.

(a) Verbal definitions. Two separate written definitions were
solicited from the supervisor of the school, the first during the
course of a preliminary experiment and the second more than six
months later, immediately before the present study. Both
statements agree that three factors must be considered in judging
a trainee's probable success as a mounter: (i) ability to learn,
indicated by the speed with which instructions are grasped and
the extent to which they were retained; (ii) dexterity, judged by
ability shown in handling tweezers and small parts and by the
speed and precision demonstrated in performance of the required
operations; and (iii) attitude, based upon considerations such as
willingness to work, industry, patience, reaction to criticisms,
attitude toward quality and conscientiousness.

Of course tests of manipulative aptitudes can hardly be expected
to predict the types of behavior which were included in the
concept of "attitude." Nor were they designed to measure the
ability to follow and retain verbal instructions. A criterion
excluding these factors, however, would fail to provide an index
of the importance of the aptitudes measured by the tests in the
total constellation of factors which make for job success. For
this reason, the overall ratings were preferred to judgments
based on manipulative abilities alone.

(b) Statistically determined definitions. Two statistical studies
were made which served to determine at least partially the
meaning or content of the overall rating. The first was a
by-product of an earlier experiment and shows the relationship
between judgments of quantity and quality and an overall
evaluation. In the second, the ratings are correlated with raw
production test scores for each of the three assemblies.

(i) In order to obtain criteria for the first 51 trainees tested
in the series of experiments involving the school, the ranking
method was employed.4 When the 51st employee had completed the
training, a set of cards bearing only the names of the employees
was ranked according to overall fitness for the occupation. One
week later, two new sets of 51 cards each, for the same
employees, were ranked, respectively, for the quality and for the
quantity of the work produced during training. The
rank-difference correlations (ρ) between the three sets of ranks
are shown in Table 2. The denominator of the corresponding t
ratios was calculated by setting ρ equal to zero in the formula
for the standard error of that statistic. At the 1% level, with
49 degrees of freedom, t equals 2.68. Thus all intercorrelations
are highly significant.


Table 2

Intercorrelations Between Quality, Quantity and Overall Ranks
(N = 51)

Ranks Correlated        ρ       tρ

Quality-Quantity      .665    4.54
Quality-Overall       .740    5.05
Quantity-Overall      .852    5.81
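
The t ratios in Table 2 can be reproduced under one assumption about the standard error, which the text does not state explicitly. The sketch below assumes the classical formula σρ = 1.0471(1 − ρ²)/√N for the rank-difference coefficient; with ρ set to zero in that formula, it gives the tabled t values to two decimals. The function names are illustrative.

```python
import math


def sigma_rho(rho: float, n: int) -> float:
    """Assumed classical standard error of a rank-difference correlation."""
    return 1.0471 * (1 - rho ** 2) / math.sqrt(n)


def t_ratio(rho: float, n: int) -> float:
    """Ratio of rho to its standard error evaluated at rho = 0."""
    return rho / sigma_rho(0.0, n)


N = 51
CRITICAL_T_1PCT = 2.68  # value quoted in the text for 49 degrees of freedom

for pair, rho in [("Quality-Quantity", 0.665),
                  ("Quality-Overall", 0.740),
                  ("Quantity-Overall", 0.852)]:
    t = t_ratio(rho, N)
    print(f"{pair}: t = {t:.2f}, exceeds 1% value: {t > CRITICAL_T_1PCT}")
```

Because the denominator is evaluated at ρ = 0, it is the same for all three pairs, so the t ratios differ only in proportion to the coefficients themselves.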

The correlation of .665 between the quality and quantity ranks
indicates a substantial amount of overlap between the judgments
of these two factors. Granting validity to the respective
judgments, this would mean that the faster operators tend to
produce work of higher quality, whereas the slower operators tend
to produce work of poorer quality.

4 The rank method was preferred to rating in this preliminary
experiment because it was thought to involve a simpler and more
precise judgment process. The results were employed, along with
discussions of the nature and distribution of individual
differences, as a means of training the supervisors in making the
ratings which served as a criterion for evaluating all tests
given subsequently to prospective mounters.

The  remaining  two  coefficients  indi- 
cate that  the  separate  judgments  of  qual- 
ity and  quantity  are  important  compo- 
nents of  the  overall  judgment.  The  some- 
what greater  weight  given  to  quantity  of 
work  may  be  explained  in  part,  at  least, 
by  the  circumstances  attending  the  rank- 
ing. The  51  subjects  had  completed  their 
training  during  the  three  week  period 
immediately  preceding  the  overall  rank- 
ing. An  interval  of  between  one  and  four 
weeks  elapsed  before  the  quality  and 
quantity  rankings  were  made.  It  was 
necessary,  therefore,  for  the  supervisors 
of  the  school  and  the  instructors  to  de- 
pend upon  their  written  records  of  per- 
formance. Quantity  of  production  on 
each  of  the  jobs,  together  with  notations 
referring  to  extraneous  factors  (e.g.,  ma- 
chine trouble)  affecting  it,  were  recorded 
more  precisely  than  characteristics  such 
as  workmanship  and  attitude.  It  would 
not  be  surprising,  therefore,  if  quantity
of  work  was  given  more  weight  in  these
belated  rankings  than  it  would  have  received
if  the  judgments  were  made  immediately
after  graduation.

These  data  are  important  to  the  pres- 
ent experiment  in  several  ways.  They 
demonstrate  that  the  pooled  judgments 
of  the  supervisor  and  the  instructors  are 
capable  of  high  reliability,  and,  that 
quantity  and  quality  of  work  are  impor- 
tant components  of  the  overall  judg- 
ments. Since  the  records  were  consulted 
throughout,  the  results  lend  support  to 
the  assertion  that  reliable  judgments  can 
be  made  on  the  basis  of  the  production 
test  scores,  the  number  of  defective  units 
produced  and  notations  referring  to 
workmanship,  attitude  and  disturbing 
conditions.5


5 It cannot be stated, however, that the judgments were based
solely on the records. Since the individual trainees were once
known to the instructors, it is probable that memory of their
performance played a part in determining both sets of rankings.
However, with respect to the reliability of the rankings, it is
necessary only that the spurious effect of memory of the first
ranking be eliminated. It appears that this factor was adequately
controlled in the experimental design, by the number of subjects,
the time intervals, and the fact that the judges were led to
believe that the overall rankings completed the task.

(ii) The extent to which the overall ratings are based upon the
magnitude of the production test scores is indicated by the
correlation coefficients in Table 3. The raw data were selected
from the files in the same way, and for the same period, as those
used in computing the reliability of the respective assemblies.
The raw scores on each assembly were comprised of the total
number of units produced during the first two production tests.

Table 3

Correlations Between Overall Ratings and Raw Production Test
Scores

Operation       N      rbis    σbis-0     t

Assembly-L     192     .646    .0987    6.55
Assembly-M     208     .603    .0904    6.67
Assembly-N     200     .525    .0928    5.66

For reasons to be discussed in Chapter V, the rating
distributions were divided dichotomously and the coefficients
were calculated biserially. The denominator of the t ratio was
computed by setting rbis equal to zero in the formula for the
standard error of the biserial r (14). Since t values of 2.61 and
2.60 are significant at the 1% level for 150 and 200 degrees of
freedom, respectively, all coefficients are highly significant.

It was noted earlier that the supervisors endeavored to correct
the raw scores for the extraneous factors affecting them. Having
demonstrated the reliability of the supervisors' judgments, based
on production and other records, the coefficients may be
interpreted as representing the minimum correlation between
quantity of production on each of the jobs and the overall
ratings. Ordinarily, the multiple correlation coefficient would
be calculated to determine the extent to which the overall
ratings were based upon a composite of these scores. However,
since all zero-order coefficients were distorted by uncontrolled
conditions (p. 11) this statistic could not be taken as a good
estimate of the amount of relationship. There is little doubt
that the true multiple R, if known, would be greater than the
highest zero-order coefficient in Table 3.

In spite of this, the data further substantiate one of the
conclusions reached earlier, namely, that quantity of production
is an important component of the overall ratings.
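
Each t ratio in Table 3 is the biserial coefficient divided by its standard error computed for rbis = 0. The sketch below checks that arithmetic from the tabled figures. It also shows, as an assumption rather than as the formula of reference (14), the classical large-sample expression for that standard error, √(pq)/(y√N), where p and q are the proportions in the two rating groups and y is the ordinate of the normal curve at the point of division. The proportions used in the study are not given in this section, so the even split in the example is purely hypothetical.

```python
import math
from statistics import NormalDist


def sigma_bis_zero(p: float, n: int) -> float:
    """Assumed standard error of biserial r when the true r is zero."""
    q = 1.0 - p
    z = NormalDist().inv_cdf(p)   # point of dichotomy on the normal curve
    y = NormalDist().pdf(z)       # ordinate of the curve at that point
    return math.sqrt(p * q) / (y * math.sqrt(n))


# Coefficients and standard errors as reported in Table 3.
table_3 = {"L": (0.646, 0.0987), "M": (0.603, 0.0904), "N": (0.525, 0.0928)}
t_ratios = {name: r / se for name, (r, se) in table_3.items()}

for name, t in t_ratios.items():
    print(f"Assembly-{name}: t = {t:.2f}")

# Hypothetical even split of the ratings, for illustration only.
print(round(sigma_bis_zero(0.50, 200), 4))
```

The standard error under this formula grows as the division moves away from the median, which is one reason the point of dichotomy matters when such coefficients are compared.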

2.   The  Steps  of  The  Rating  Scale. 

The rating scale consisted of the five grades, Poor, Fair,
Average, Good and Very Good, which were selected by the
supervisors during the discussions in which the procedures were
developed. The overall rating received by a trainee was
communicated to the experimenter as an abbreviation of the
appropriate adjective, recorded on a card bearing the employee's
name, the number of units produced during each production test,
and miscellaneous observations and comments. While each of the
adjectives was verbally defined, their essential meaning for this
experiment will be derived (Ch. V, Section A) from an analysis of
the distribution and from the answer to a series of questions,
designed to separate "satisfactory" from "unsatisfactory"
employees.

B. THE APTITUDE TESTS

Five manipulation tests were administered to all applicants
included in this study. Although a standard intelligence test was
also given when time permitted, a check on the mean and spread of
the scores showed that it was not discriminating sufficiently
among the lower scoring members of the applicant group. Since the
selection and validation of a more suitable measure of
intelligence did not coincide with the defined boundaries of the
present experiment, these data will not be presented.

1.  Procedures  for  Administration. 

The  procedures  for  administration  and 
scoring  were  adapted  to  the  requirements 
of  the  mass  testing  situation  in  which  the 
results  were  to  be  applied.  The  revised 
procedures  are  described  below.  The  tests 
were  administered  in  the  order  listed. 

(a) Minnesota Rate of Manipulation, Placing. The standard
instructions (17) advise the subject to "Push the board away from
you, leaving a clear space about twelve inches wide in front of
you." In order to control the distance of the hand movements and
the alignment of the blocks in relation to the empty board, four
wooden angles were secured to the table. In this way, a constant
distance of ten-and-five-eighth inches was substituted for the
somewhat variable twelve inch distance recommended in the
instructions. The basic motion pattern remained the same.6

In contrast with the standard instructions, the test was
administered on a time limit basis. After one complete trial for
practice, four forty-second trials were given. The score
consisted of the total number of blocks placed during the four
trials. After each trial, the subjects were instructed to place
the remaining blocks in the holes, using both hands. The starting
directions, which are important motivationally, were as follows:
"Now you want to see how many you can do in just 40 seconds. Take
hold of the bottom block with your right hand. Get ready. Go."
Before the second trial, the need for repetition was explained as
a means of giving everyone a fair chance. This was found
necessary in order to maintain cooperation throughout and to
avoid the "What, again?" attitude which had been expressed
frequently.

6 This arrangement was found satisfactory for the period in which
the placing and turning tests were administered experimentally to
all applicants, and when space permitted the permanent storage of
the tests on the tables. It was modified, however, when these
tests were found valid for only a few occupations. According to
the new method, the tests are stored in a cabinet on removable,
plywood shelves, with a 5/16" high, square, wooden strip along
each of the four edges. The strips along the two side edges are
cut away over a length of four inches to permit lifting the empty
test board after it is used to position the blocks for placing.
All dimensions are such as to preserve the original conditions in
which the test was standardized.

(b)  Minnesota  Rate  of  Manipulation, 
Turning.  Except  for  the  substitution  of  the 
time  limit  for  the  work  limit  method,  this 
test  was  administered  in  the  standard  way 
(19).  After  one  complete  trial  for  practice, 
four  thirty-second  trials  were  given.  The 
total  number  of  blocks  turned  during  the 
four  trials  comprised  the  score.  After  each
trial,  the  subjects  were  instructed  to  com- 
plete the  board  exactly  as  during  the  test. 
The  send-off  directions  were  as  follows:  "Now 
you  want  to  see  how  many  blocks  you  can 
turn  over  in  just  30  seconds.  Take  hold  of  the 
first  block  in  the  upper  right  hand  corner 
with  your  left  hand.  Get  ready.  Go." 

(c)  Finger  Dexterity  Test.  The  test  equip- 
ment, described  by  O'Connor  (11)  was  used. 
According  to  the  standard  instructions  (21), 
subjects  are  directed  to  "Start  in  the  farthest 
corner  and  work  toward  you."  On  the  other 
hand,  instructions  to  the  examiner  say:  "Al- 
low the  examinee  to  place  30  pins,  thus  fill- 
ing the  top  line  of  ten  holes,  for  practice." 
Apparently  following  this  latter  instruction, 
Tiffin  (18)  presents  a  picture  of  a  subject 
filling  the  holes  from  left  to  right  along  the 
top  of  the  board.  At  the  same  time,  O'Connor 
(11) shows a subject taking the Tweezer Dexterity Test, for which
the directions are the same in this respect, who has filled five
rows perpendicular to the length of the board and who appears,
moreover, to be placing the first pin in the sixth row in the
hole nearest her. Thus it is difficult to say whether or not the
standard procedure was followed in this experiment. Nevertheless,
all applicants were instructed to start in the upper left hand
corner and to work down the row, that is, transverse to the
length of the board. All rows were filled in the same way. Since
the time limit method was used and only a portion of the board
completed, the order in which the holes are filled is important
because of its effect upon the total distance travelled in the
testing period.

After filling the ten holes in one row for practice, the subject
was given two three-and-one-half minute trials, starting each
with an empty board. The score consisted of the total number of
holes filled during the two trials. The time allowed was stated
in the starting directions for this as for all tests.

(d)  Purdue  Pegboard,7  Assembly  Test.  The 
subtest  of  the  Purdue  Pegboard  (22)  in- 
cluded in  this  study,  consists  of  the  repetitive 
assembly  of  a  pin,  a  cylindrical  collar  and
two  washers,  and  will  be  referred  to  as  the 
Purdue  Assembly  test.  Except  for  the  fact  that 
four  complete  assemblies  were  allowed  for 
practice  in  addition  to  the  two  normally  con- 
structed in  the  course  of  the  initial  instruc- 
tions, the  standard  procedures  were  followed. 
The  test  proper  consisted  of  three  separate 
trials  of  one  minute  duration.  The  number  of 
parts  assembled  during  each  trial  was  re- 
corded and  their  total  served  as  the  final 
score. 

7  Tiffin    (18)  refers  to  this  test  as  the  "Purdue 
Dexterity  Test." 


(e) Tweezer Dexterity Test. The standard test equipment, designed
by O'Connor (11, 23), was used. Holes were filled in the order
described for the Finger Dexterity Test. In contrast to the
method pictured by O'Connor (11), the tweezers were held somewhat
like a pencil, with the tip of the third finger near the point.
The index and third fingers were spread apart on one leg of the
tweezers, with the thumb pressing against the opposite leg. A
similar technique is applied to most mounting operations. The
method of placing the pins was also covered in the instructions:
". . . if you pick up the pins correctly and keep your hand and
wrist relaxed, you can slant the pin to start it in the hole and
then straighten it out. . . . But be sure to straighten it out
before you let go of it, otherwise it stands up like this and
that's wrong."

After filling the ten holes in the first row for practice, the
subject was given two three-minute trials, starting each with an
empty board. The score consisted of the total number of holes
filled during the two trials.

Data  related  to  the  reliability  of  the 
tests  as  applied  to  the  experimental 
group  will  be  presented  in  Chapter  V. 
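
The scoring rules described for the five tests reduce to a small table of trial counts and time limits. The following sketch restates them as data; the class and field names are illustrative and not part of the original procedures, and the sample trial counts are invented solely to show how a total score is formed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AptitudeTest:
    name: str
    trials: int          # number of scored trials
    trial_seconds: int   # time limit per scored trial

    def timed_seconds(self) -> int:
        """Total scored working time, excluding practice and instructions."""
        return self.trials * self.trial_seconds

    def score(self, units_per_trial: list[int]) -> int:
        """The score is the total number of units completed over all trials."""
        if len(units_per_trial) != self.trials:
            raise ValueError(f"{self.name} requires {self.trials} trials")
        return sum(units_per_trial)


# Trial counts and time limits as given in the test descriptions.
BATTERY = [
    AptitudeTest("Minnesota Placing", 4, 40),
    AptitudeTest("Minnesota Turning", 4, 30),
    AptitudeTest("Finger Dexterity", 2, 210),
    AptitudeTest("Purdue Assembly", 3, 60),
    AptitudeTest("Tweezer Dexterity", 2, 180),
]

total_timed = sum(test.timed_seconds() for test in BATTERY)
print(f"Scored working time: {total_timed / 60:.1f} minutes")

# Invented trial counts, for illustration only.
print(BATTERY[0].score([38, 40, 41, 42]))
```

Keeping the time limits in one place makes it easy to see how much of a testing session is scored work, as distinct from practice trials and instructions.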


IV. Experimental Conditions and Subjects


Except for the administration of aptitude tests, all subjects
were processed in accordance with the regular selection and
placement procedures in effect at the time. As applied to female
applicants for production jobs in the factory, these procedures
were essentially as follows:

1. All applicants were required to fill in an application blank.

2. They were then given a screening interview for the purpose of
checking such details as completeness of the application blank,
type of work desired, and shift preferred. An effort was made to
determine in a general way whether or not some type of available
factory work was mutually acceptable.

3. The experimental battery of tests was administered in a quiet,
pleasant room somewhat removed from the activities of the
employment office. All applicants took the tests under the
impression that the results were to be used for placement
purposes. Actually, the results were not reported to the
interviewers. Instead, they were filed and later compared with
performance records.

4.  During  the  placement  interview  which 
followed,  each  member  of  this  applicant 
group  was  assigned  to  a  particular  occupation 
and  applied  against  a  specific  requisition. 

5.  Applicants  were  then  routed  to  the  dis- 
pensary for  a  medical  examination  which 
included  such  items  as  general  physical  con- 
dition, height  and  weight  measurements,  and 
a  telebinocular  test  of  vision. 

Individuals  designated  as  mounters 
were  scheduled  to  begin  training  on  the 
following  work  day  in  the  Vestibule 
Training  School.  After  three  or  four  days 
of  training,  they  were  assigned  to  spe- 
cific operations  in  the  factory,  in  accord- 
ance with  the  abilities  and  characteristics 
demonstrated  in  the  school. 

It was noted earlier that, since the research with mounters
covered a considerable period of time and was pursued with
various tests and procedures, the present study does not include
all applicants processed in this way. The major reduction in the
size of the group was accomplished by fixing the dates for the
beginning and end of the experiment, as May 17th and September
15, 1943, respectively, during which time the tests and the
procedures for administration followed rigorously the pattern
previously described.

Application  of  the  following  criteria 
resulted  in  the  further  elimination  of 
subjects  from  this  group. 

1.  All  subjects  known  to  he  left  handed 
were  eliminated  to  avoid  the  possible  spuri- 
ous effect  of  this  factor  on  the  empirical 
correlations.  Left  handedness  was  determined 
in  two  ways,  (a)  All  subjects  were  allowed 
to choose the hand to be used for the placing
test.  In  the  event  of  hesitancy,  the  impor- 
tance of  using  the  faster  hand  was  empha- 
sized, and  further  opportunities  for  making 
the  choice  were  presented  during  the  instruc- 
tions on  the  finger  dexterity  and  tweezer  dex- 
terity tests.  A  question  mark  was  posted  to 
separate those who vacillated or changed
hands from those who were consistently right
handed,  (b)  The  supervisor  of  the  school 
made  frequent  notations  on  the  cards  re- 
porting production  test  scores  and  overall 
ratings,  in  order  to  assist  the  experimenter  in 
identifying  factors  which  might  affect  the 
relationships  between  tests  and  performance. 
These  notations  were  also  posted  on  the 
trainee's  card.  Although  the  method  of  deter- 
mining handedness  in  individual  cases  was 
not  recorded,  all  records  bearing  the  nota- 
tion, "L.H.,"  with  or  without  a  question 
mark, were excluded.

2.  In  order  to  avoid  recalculating  certain 
intermediate  statistics  for  each  correlation  co- 
efficient, subjects  who  did  not  have  all  five 
manipulation  tests  were  eliminated. 

3.  Individuals  who  had  previous  experi- 
ence with  the  company  were  not  included. 
Since  this  information  was  posted  from  the 
daily  hire  sheet  which  identifies  rehires  with- 
out specifying  the  previous  occupation,  it  is 
possible  that  a  number  of  the  trainees  ex- 
cluded had  no  direct  experience  as  mounters. 
However, since most tube making operations
provide experience in the manipulation of small
parts and, in some cases, training in the use of
tweezers, the elimination of all rehires was
considered advisable.

4.  An  effort  was  made  to  detect  identical  or 
related  experience  in  other  companies.  This 
was  done  in  two  ways,  (a)  Subjects  were 
asked,  during  the  course  of  the  instructions 
on  the  tweezer  dexterity  test,  "Has  anyone 
used  tweezers  in  work,  before?"  (b)  Previous 
experience  in  mounting  was  one  of  the  fac- 
tors noted  on  the  report  card  from  the  Vesti- 
bule Training  School,  and  was  posted  to  the 
trainee's  file  card.  Since  all  trainees  were 
thoroughly  interviewed  in  the  school,  it  is 
reasonably  certain  that  all  experienced 
mounters  were  detected  and  eliminated. 
However,  since  applicants  were  drawn  from 
a  highly  industrialized  area,  it  is  entirely  pos- 
sible that  all  experience  helpful  to  perform- 
ance in  both  the  test  and  job  situations  was 
not  adequately  controlled.  Moreover,  while 
the  point  would  be  difficult  to  check  at  this 
late  date,  it  is  the  author's  impression  that 
no  one  answered  affirmatively  the  question 
pertaining  to  the  previous  use  of  tweezers 
who  would  not  have  been  eliminated  as  an 
experienced  mounter.  If  this  is  so,  it  raises 
the  question  of  whether,  in  view  of  the  mo- 
tives for  concealing  this  information  in  the 
test  situation,  the  answers  were  valid.  Fur- 
ther data,  pertinent  to  this  question  as  it  re- 
lates to  the  tests  of  the  final  battery,  will 
therefore  be  presented  in  a  later  section 
(Ch.  V,  Section  I). 

5.  Only  those  subjects  who  were  reported 
to  have  had  a  total  of  three  or  more  produc- 
tion tests  on  at  least  two  of  the  three  standard 


jobs  in  the  training  school  were  included. 
The fact that many of the eliminated subjects
had been given tests on other jobs was
not considered. While there were good reasons
for requiring production tests on all
three jobs, this stricter criterion would have
been selective with respect to the abilities involved.
Poor and very poor trainees were
given considerably more training and practice
on the first two jobs than is usually required,
and were frequently assigned to carefully selected
non-mounting jobs in the factory before
completing the training. The experimental
group, which consisted of 233 trainees,
received a total of 1895 tests on welding jobs
while in the school. 1791 of these were distributed
over the three welding jobs described,
yielding an average of 7.7 tests per
trainee for these three jobs. Considering the
reliability coefficients presented in Table 1,
there  can  be  little  question  that  the  related 
abilities  of  the  majority  of  the  experimental 
group  were  adequately  measured  in  the 
school.  If  the  group  includes  individuals  who 
were  judged  on  unreliable  or  different 
grounds,  these  would  serve  to  reduce  the  cor- 
relations between  the  tests  and  the  criterion. 
Thus,  if  satisfactory  validity  coefficients  are 
obtained,  it  can  be  assumed  that  subjects 
were  judged  with  sufficient  reliability  and 
uniformity. 

6.  Trainees who were rated by the instructors
during the period when the supervisor of
the school was on vacation were eliminated.

In  this  way,  a  group  of  233  subjects 
was  selected  for  further  study. 
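
The six exclusion criteria above amount to a filter applied to each trainee's file card. A minimal sketch of that filter follows. The record layout and every field name are hypothetical, since the text describes the cards only in prose, and the sketch makes no attempt to model how each fact was ascertained.

```python
from dataclasses import dataclass


@dataclass
class TraineeCard:
    # Hypothetical fields; the monograph does not define a record layout.
    marked_left_handed: bool      # card bears "L.H.", with or without a question mark
    manipulation_tests_taken: int  # how many of the five manipulation tests were taken
    rehire: bool                  # previous experience with the company
    experienced_mounter: bool     # identical or related experience elsewhere
    production_tests: int         # production tests on the three standard jobs
    standard_jobs_tested: int     # how many of the three standard jobs were tested
    rated_during_vacation: bool   # rated while the school supervisor was away


def is_retained(card: TraineeCard) -> bool:
    """Apply criteria 1 to 6 in the order given in the text."""
    return (
        not card.marked_left_handed
        and card.manipulation_tests_taken == 5
        and not card.rehire
        and not card.experienced_mounter
        and card.production_tests >= 3
        and card.standard_jobs_tested >= 2
        and not card.rated_during_vacation
    )


example = TraineeCard(False, 5, False, False, 7, 3, False)
print(is_retained(example))
```

Every criterion is a simple exclusion, so the order of application does not affect which subjects remain.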


V.  Results and Discussion


Two independent sets of data were
thus procured in the course of the experiment,
namely, the ratings made by
the supervisor of the school and the staff
of instructors, and the respective test
scores. It remains to present and examine
both sets of data, to determine the
degree of relationship between each of
the tests and the criterion and to develop
a regression equation which will
predict performance in the school with a
minimum of error.

A.  THE CRITERION SCORES

The number and percent of subjects receiving
each of the five overall performance
ratings are presented in Table 4.
Inspection of the data reveals an excess
of ratings at the high end. The extent
of these deviations from normality and
the likelihood that they could have arisen
by chance alone will be determined by
applying the tests for skewness and kurtosis
to the data.

Table 4

Distribution of Overall Ratings
(N = 233)

Rating          N       %
Poor           21      9.0
Fair           36     15.4
Average        67     28.8
Good           57     24.5
Very Good      52     22.3

1.  Analysis of Distribution for Skewness
and Kurtosis.

Assigning the numbers, 1, 2, 3, 4, and
5 to the steps of the rating scale, starting
with Poor, the mean of the distribution
becomes 3.86, where 3.50 is the theoretical
midpoint of the scale. The standard
deviation, in terms of the assigned numbers,
is 1.24. The test for skewness yields
a value of -0.278 for g1 (15), with a
standard deviation of 0.159 and a t of
1.75. Since the test of significance is carried
out with an infinite number of degrees
of freedom, t must equal or exceed
1.64, 1.96, or 2.33, in order to be significant
at the 10%, 5% or 1% levels, respectively.
Thus, while not significant at
the 5% level as usually defined, a negative
skewness as great or greater could
arise by chance, under these conditions,
only about 4.5% of the time (14). Further
evidence is found in the distribution
of 359 ratings, obtained under similar
circumstances, immediately prior to the
present experiment. The number and
percent of subjects receiving each of the
five ratings from Poor to Very Good were
32 (8.9%), 59 (16.4%), 86 (24.0%), 96
(26.7%) and 86 (24.0%). It appears safe
to conclude, therefore, that the negative
skewness of the present distribution did
not arise by chance alone.

The statistic, g2, was calculated to be
-0.854, with a standard error of 0.318.
Since t, for these values, is 2.69, platykurtosis
is demonstrated at better than
the 1% level.

Thus the criterion distribution departs
significantly from normal with respect to
kurtosis and exhibits a strong tendency
toward negative skewness.

2.  Differential Reliability Along The
Rating Continuum.

The activity of judging performance
in terms of the steps of a rating scale
must be viewed against the backdrop of
the needs and the practical demands of
the situation, and the rater's reaction to
them, if the results are to be interpreted
and evaluated properly.

One aspect of the situation which undoubtedly
affected the ratings was the
fact that the supervisor of the school had
to decide among three possible courses
of action for each employee, (a) Termination.
Since the trainees were already employees
of the company, every effort was
made to avoid this action, (b) Placement
on  a  relatively  simple  job  other  than 
mounting.  Because  the  demand  for 
mounters  was  greater  than  for  any  other 
single  occupation,  considerably  more 
time  was  spent  in  evaluating  the  apti- 
tudes of  the  poorer  operators.  This  was 
necessary to avoid excluding trainees with
a  reasonable  chance  of  success  in  mount- 
ing and  to  assess  other  potentialities  and 
interests  which  might  indicate  placement 
on  another  type  of  work,  (c)  Placement 
in  the  factory  as  a  mounter.  This  course 
of  action  frequently  entailed  additional 
decisions  relating  to  the  difficulty  of  the 
job  which  could  be  handled  by  the 
trainee.  In  addition,  while  the  responsi- 
bility for  final  placement  rested  with  the 
school  supervisor,  the  number  of  opera- 
tors to  be  assigned  to  any  specific  manu- 
facturing area  during  any  period  was 
controlled  by  a  priority  system,  devel- 
oped as  a  wartime  measure  and  operated 
by  those  in  charge  of  planning  and 
scheduling  production.  Thus,  even 
though  considerable  freedom  was  allowed 
in  order  to  insure  maximum  long-range 
utilization  of  the  manipulative  and  other 
abilities  demonstrated  in  the  school,  the 
practical  question  frequently  became, 
"Can  this  trainee  do  the  job  to  which  we 
should  assign  her?"  Under  these  circum- 
stances, it  is  not  surprising  that  more 
time  was  spent  determining  whether  or 
not  the  trainee  had  the  minimum  degree 
of  manipulative  ability  required  by  the 
specific  job  under  consideration,  than  in 
carefully  measuring  the  differences  in 
ability among those whose success seemed
assured. 

Of  equal  importance  is  the  fact  that 
the  poorer  operators  required  more  train- 
ing and  attention  than  those  who  grasped 


instructions quickly and encountered few
problems. It is only natural that the instructors
should have allocated their time
in accordance with the needs of the individual
trainees. The only exception to
this general tendency occurred in the
handling of trainees assigned to mounting
operations involved in exceptionally
small and delicate types of tubes. These
girls were selected from the Good and
Very Good groups during their last day
in the school and were given training,
practice and production tests with a finer,
more difficult type of assembly. But these
additional measures did not result in a
more precise grading of all the trainees in
the two highest rating groups. Not only
were many trainees with high ratings
sent to other jobs, but the selection of
trainees for these operations depended
upon the requirements of the factory at
the particular time.

Thus,  because  of  the  practical  de- 
mands of  the  situation  in  which  the  judg- 
ments were  made,  it  is  highly  probable 
that  employees  at  the  lower  end  of  the 
scale  were  rated  with  greater  precision 
than  those  at  the  higher  end.  These  facts 
had  an  important  bearing  upon  the  selec- 
tion of  statistical  methods  for  the  re- 
mainder of  the  study. 

3.   The  Selection  of  Appropriate  Sta- 
tistical Methods. 

The existence of differential reliability
between the higher and lower steps of the
scale points to the biserial r as the statistic
which will properly evaluate the relationships
between the test scores and
the criterion. According to this method,
the criterion distribution is divided into
two parts by drawing a line through a
suitable point, somewhere along the rating
scale. Trainees with higher ratings
are placed in one group and those rated
below the line are placed in a second
group. Since this dividing line could be
drawn  between  any  two  contiguous  steps 
on  the  scale,  a  choice  must  be  made  from 
among  four  possible  lines.  Ordinarily, 
the differences in size among the coefficients
yielded by the various divisions
would fall within the limits of the standard
error of the coefficients; that is, variations
in size would be due to chance
errors  alone.  Normally,  therefore,  the 
choice  of  dividing  line  is  of  little  conse- 
quence, excepting  in  so  far  as  the  relative 
sizes of the two groups affect the magnitude
of the standard error itself. However,
since the practical situation demanded
more precise classification of the
group  at  some  points  along  the  scale  than 
at  others,  variations  among  the  four  pos- 
sible correlation  coefficients  for  any  test 
would  probably  exceed  that  assignable 
to  chance  errors  alone.  In  order  to  insure 
that  the  choice  would  be  made  on 
grounds  other  than  the  characteristics  of 
the  resultant  arrays,  the  following  inter- 
office communication  was  sent  to  the 
supervisor  of  the  school. 

Taking it for granted that people rated
Very  Good  and  Good  are  good  risks  for 
mounting,  which  of  the  following  statements 
is  most  nearly  true?  Check  one. 

(  )  People  rated  Poor  are  not  good 
risks  for  mounting;  people  rated  Fair 
and  above  are  good  risks. 
(  )  People  rated  Poor  and  Fair  are  not 
good  risks  for  mounting;  people  rated 
Average  and  above  are  good  risks. 
(  )  People  rated  Poor,  Fair  and  Average 
are  not  good  risks  for  mounting;  peo- 
ple rated  Good  and  above  are  good 
risks. 

The  sheet  was  returned  with  the  sec- 
ond statement  checked,  indicating  that 
trainees  rated  Poor  and  Fair  were  re- 
garded as  "unsatisfactory,"  whereas  those 
rated Average and above were considered
"satisfactory." Some time later, while discussing
this response, the supervisor remarked
that if it were at all possible, she
would not place in mounting anyone who
failed to achieve the performance standards
required for a rating of at least
Average.

For  these  reasons,  the  dividing  line  was 
located  between  the  ratings,  Fair  and 
Average,  in  calculating  the  biserial  cor- 
relations between  the  ratings  and  the  re- 
spective tests.  Since  the  biserial  r  is  math- 
ematically equivalent  to  a  product  mo- 
ment r,  corrected  for  broad  categories, 
this  procedure  provides  a  statistic  which 
can  be  used  in  developing  a  regression 
equation.  The  relatively  slight  loss  in 
precision  which  attends  the  broad  group- 
ing of  applicants  will  be  reflected  in  the 
standard  error  employed  for  testing  sig- 
nificance.8 
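
The biserial calculation described above can be sketched as follows. The formula used is the conventional textbook one, r_bis = ((M_p - M_q) / sigma_t) (pq / y), which the monograph cites but does not print, so its use here is an assumption. The eight scores and ratings are invented solely to make the sketch runnable and are not data from the study.

```python
from statistics import NormalDist, mean, pstdev

# Dividing line between Fair and Average, as chosen in the text.
SATISFACTORY = {"Average", "Good", "Very Good"}


def biserial_r(scores, ratings):
    """Biserial r between test scores and the dichotomized rating."""
    upper = [s for s, r in zip(scores, ratings) if r in SATISFACTORY]
    lower = [s for s, r in zip(scores, ratings) if r not in SATISFACTORY]
    p = len(upper) / len(scores)   # proportion rated satisfactory
    q = 1.0 - p                    # proportion rated unsatisfactory
    # Ordinate of the unit normal curve at the point cutting off q below it.
    y = NormalDist().pdf(NormalDist().inv_cdf(q))
    # pstdev: standard deviation of the whole group, computed with N.
    return (mean(upper) - mean(lower)) / pstdev(scores) * (p * q / y)


# Invented illustration, not the study's data.
scores = [10, 14, 12, 18, 16, 20, 15, 13]
ratings = ["Poor", "Fair", "Average", "Good", "Fair", "Very Good", "Average", "Poor"]
print(round(biserial_r(scores, ratings), 3))
```

With the study's own frequencies, p would be 176/233 and q would be 57/233, since 57 of the 233 trainees were rated Poor or Fair.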

8 In a previous experiment with the same criterion,
correlation coefficients were calculated by
the product-moment method and biserially, with
the dividing line drawn between the Fair and
Average ratings. The biserial coefficients were
consistently higher. This was interpreted as substantiating
the conclusions reached through
analysis of the psychological situation of the
raters. Strictly speaking, the biserial coefficients
represent the amount of correlation that would
be obtained by the product-moment method,
utilizing the five rating grades and correcting for
broad categories, if discriminations were made
with the same precision between each of the contiguous
pairs of ratings on the scale as between
the ratings, Fair and Average.

Another problem posed by the nature
of the criterion distribution concerns the
scale to which the regression equation
must predict. In order to avoid the complex
mathematical procedures of curvilinear
regression, which predictions to the
original platykurtic and negatively
skewed distribution would entail, an arbitrary
scale was constructed by selecting
a mean of 50.000 and a standard deviation
of 14.286 (= 100/7), as parameters
of the criterion distribution. Thus the
range, M ± 3.5σ, spans the numbers from
zero to one hundred and steps of 0.7σ
become steps of 10 points on the assumed
criterion scale.

The  procedures  followed  to  relate  pre- 
dictions on  the  assumed  scale  to  the  five 
grades  of  the  original  rating  scale  will  be 
described  in  Section  E  of  this  chapter. 
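
The skewness and kurtosis figures quoted in this section, and the arithmetic of the assumed scale, can be checked from the frequencies of Table 4 with a short sketch. It assumes that g1 and g2 are Fisher's statistics with their usual standard errors under normality; the text cites a reference for them but prints no formula. Under that assumption the results agree closely with the values reported above.

```python
from math import sqrt

# Table 4 frequencies, with the rating steps numbered 1 (Poor) to 5 (Very Good).
COUNTS = {1: 21, 2: 36, 3: 67, 4: 57, 5: 52}


def central_moments(counts):
    """Return N and the second, third and fourth moments about the mean."""
    n = sum(counts.values())
    mean = sum(x * f for x, f in counts.items()) / n
    m2, m3, m4 = (
        sum(f * (x - mean) ** k for x, f in counts.items()) / n for k in (2, 3, 4)
    )
    return n, m2, m3, m4


def skewness_and_kurtosis(counts):
    """Fisher's g1 and g2 (assumed) with their standard errors under normality."""
    n, m2, m3, m4 = central_moments(counts)
    g1 = sqrt(n * (n - 1)) / (n - 2) * m3 / m2 ** 1.5
    g2 = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * (m4 / m2 ** 2 - 3) + 6)
    se_g1 = sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_g2 = sqrt(24 * n * (n - 1) ** 2 / ((n - 3) * (n - 2) * (n + 3) * (n + 5)))
    return g1, se_g1, g2, se_g2


def assumed_scale_steps(mean=50.0, sd=100 / 7):
    """Scale values at steps of 0.7 sigma across the range M +/- 3.5 sigma."""
    return [mean + k * 0.7 * sd for k in range(-5, 6)]


g1, se_g1, g2, se_g2 = skewness_and_kurtosis(COUNTS)
print(f"g1 = {g1:.3f}, t = {abs(g1) / se_g1:.2f}")
print(f"g2 = {g2:.3f}, t = {abs(g2) / se_g2:.2f}")
print([round(v) for v in assumed_scale_steps()])
```

The reported mean of 3.86 is half a unit above the mean of the integers 1 to 5 weighted by these frequencies, which suggests that each step was treated as a unit interval with its midpoint at 1.5, 2.5 and so on. That reading is an inference, not something the text states, and a constant shift of this kind leaves g1 and g2 unchanged.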

B.  THE  TEST  SCORES 

1.  Means  and  Standard  Deviations  of 
Test  Scores. 

In Table 5 the mean and standard
deviation of the scores on each test are
entered, together with their respective
standard deviations. The means (M) represent
the average performance of the
experimental group on the tests. The
precision with which the mean was determined
(i.e., the amount of variation
among sample means to be expected
upon replication) is indicated by the respective
values of σM. The standard deviation
of the distribution (σ dist.) for each
test serves as a measure of the extent to
which individuals differ in test performance,
whereas the standard deviation of
this statistic (σσ) reflects the precision
with which it has been determined. It is
apparent from these data that the mean
and the standard deviation of the distribution
have been determined with reasonable
precision.

Table 5

Means and Standard Deviations
of Test Scores
(N = 233)

Tests                   M       σM     σ dist.    σσ
Placing               171.9    0.84     12.8     0.59
Turning               156.7    0.95     14.4     0.67
Finger Dexterity       92.0    0.64      9.7     0.45
Purdue Assembly       127.9    1.12     17.1     0.79
Tweezer Dexterity      99.4    0.87     13.2     0.61

2.  Reliability  of  The  Test  Scores. 

Data pertinent to the reliability of the
total scores on each test, as applied to
the experimental group, are recorded in
Table 6. For the placing and turning
tests, the initial product-moment r was
calculated between the sum of the scores
on  the  first  two  trials  and  the  sum  of  the 
scores  on  the  third  and  fourth. 

The  reliability  of  the  scores  based 
upon  four  trials  was  then  computed  by 
the  Spearman-Brown  prophecy  formula. 
Thus  the  indices  in  the  fourth  column 
represent  the  reliability  of  the  total  scores 
on  these  tests  for  the  experimental  group. 
The  initial  r's  for  the  remaining  three 
tests  were  calculated  between  the  scores 
on  the  first  and  second  trials,  by  the  prod- 
uct-moment method,  and  extended  by  the 
prophecy  formula  for  two  or  three  trials 
as  indicated. 
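
The step-up by the prophecy formula, and the probable error of measurement that the text later derives from the same reliability coefficients, can be sketched as below. Both formulas are the standard ones and are not printed in the monograph, so their use here is an assumption. The initial r of 0.80 is a hypothetical value chosen only for illustration; the standard deviation and reliability used for the placing test are those tabulated in Tables 5 and 6.

```python
from math import sqrt


def spearman_brown(r_initial, k):
    """Reliability of a score k times as long as the one that yielded r_initial."""
    return k * r_initial / (1 + (k - 1) * r_initial)


def probable_error_of_measurement(sd, reliability):
    """Half of all obtained scores fall within this distance of the true score."""
    return 0.6745 * sd * sqrt(1 - reliability)


# Hypothetical r between two halves (or two single trials), stepped up to the
# full score on two parts and on three parts.
print(round(spearman_brown(0.80, 2), 3))
print(round(spearman_brown(0.80, 3), 3))

# Placing test: standard deviation 12.8 blocks, reliability .885.
print(round(probable_error_of_measurement(12.8, 0.885), 1))
```

For the placing and turning tests k is 2, since the two halves each contain two trials. For the remaining tests k equals the number of trials, since the initial r was taken between single trials.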

For mass testing purposes, the coefficient
of reliability is significant primarily
because of its effect upon the validity coefficient.
Not only does it set a ceiling
on possible validity (18), but the chance
errors introduced as reliability decreases
attenuate validity coefficients at all levels.
Had these data been available prior to
the experiment, the number of trials
would have been selected to achieve a
more uniform level of reliability on all
five tests.

Table 6
The Reliability of Test Scores

Test               No. of    Time/Trial    Coeff. of      P.E.
                   Trials                  Reliability
Placing              4       40 sec.         .885         2.9 blocks
Turning              4       30 sec.         .822         3.7 blocks
Finger Dext.         2       3.5 min.        .869         2.4 holes
Tweezer Dext.        2       3.0 min.        .820         3.8 pins
Purdue Assembly      3       1.0 min.        .906         3.5 parts

Proper interpretation of an individual's
score on a test requires an estimate
of the probable divergence of the applicant's
true score from the one obtained.
In so far as the discrepancy is due to
chance errors, such as fumbling, getting
off to a bad start, the portion of the last
motion cycle incomplete at the stop signal,
momentary confusion in method,
etc., its probable magnitude can be estimated
by the probable error of measurement,
given in the last column of the
table. For example, the chances are about
even that an individual's true score on
the placing test will fall within the limits
defined by the obtained score ± three
(2.9) blocks. In slightly more than four-fifths
of the cases the individual's true
score will lie within the range, the obtained
score ± six (5.8) blocks. Thus
small differences among applicants on
the tests do not necessarily indicate true
differences in the capacities measured.
Differences corresponding to those itemized
in the table may be entirely disregarded
in the employment office.

3.  Intertest Correlations.

The intercorrelations between the tests
of the battery are presented in Table 7.
The lowest correlation (.369) is more
than twice that required for significance
at the 1% level with 231 degrees of freedom
(15). Since all correlations are positive,
it is obvious that applicants who
score highly on one test tend to do well
on each of the remaining tests. Similarly,
there is a tendency for applicants who do
poorly on one test to obtain low scores on
each of the others. The strength of these
tendencies is indicated by the magnitude
of the product moment coefficients in the
table.

Table 7
Intertest Correlations
(N = 233)

                            T       FD      PA      TD
Placing (P)               .551    .461    .462    .395
Turning (T)                       .459    .511    .369
Finger Dexterity (FD)                     .418    .510
Purdue Assembly (PA)                              .439
Tweezer Dexterity (TD)

C.  THE CORRELATIONS BETWEEN TESTS
AND CRITERION

The extent to which scores on the respective
tests are related to the criterion
is indicated by the magnitude of the biserial
coefficients in the second column of
Table 8. The standard error employed in
the t test was computed by setting r_bis
equal to zero in the formula for that
statistic (14). Since a t of 2.60 is significant
at the 1% level for these conditions,
all coefficients are highly significant.

Table 8

Correlations Between Tests
and Criterion
(N = 233)

Tests                  r_bis      t
Placing                .564     6.56
Turning                .499     5.79
Finger Dexterity       .482     5.60
Purdue Assembly        .636     7.39
Tweezer Dexterity      .586     6.81

Having established that each of the
five  tests  is  significantly  related  to  the 
criterion,  several  questions  arise.  Should 
all five tests be administered in the selection
of future mounter trainees? If not,
which of the tests should be included in
the  final  battery?  How  should  the  scores 
on  the  selected  tests  be  weighted  to  insure 
maximum  forecasting  efficiency?  These 
and  related  questions  will  be  discussed  in 
the  following  section. 

D.  THE  COMPOSITION  AND  YIELD  OF 
SELECTED  TEST  BATTERIES 

Hull  (10),  in  1928,  clearly  demon- 
strated "the  radical  tendency  to  diminish- 
ing returns  as  successive  tests  are  added  to 
the  battery,"  the  rate  depending  upon 
the  absolute  and  relative  sizes  of  the  va- 
lidity coefficients  and  the  intertest  corre- 
lations. It  has  also  been  known  for  many 
years  that  the  multiple  correlation  co- 
efficient is  spuriously  high,  due  to  the 
accumulation  of  all  chance  errors  in  the 
positive  direction.  Thus,  successive  tests 
usually  contribute  progressively  less  to 
the  forecasting  efficiency  of  a  battery, 
while  each  adds  its  full  share  of  chance 
errors.  The  rate  of  diminishing  returns 
is  therefore  greater  than  indicated  by  the 
decreasing  increments  to  the  multiple  R. 
More recently, Wherry (16) has added
to the Doolittle method (14), a technique
which indicates the optimum order
in which tests should be added to the battery,
starting with the test of highest
validity, and which yields a series of
"shrunken multiple correlations," (R̄),
corrected  for  the  cumulative,  positive, 
chance  errors  added  by  each  successive 
test. 

The results in Table 9 were obtained
by applying the Wherry-Doolittle Test
Selection Method to the present data.

TABLE 9

The Effect of Successive Additions to the Test
Battery on its Relationship
With the Criterion*

Optimum order of test addition        R        R̄      100(R̄)²
Purdue Assembly (Zero-order r)      .636     .636     40.4%
Tweezer Dexterity                   .722     .720     51.8%
Placing                             .757     .755     57.0%
Finger Dexterity                    .761     .760     57.8%
Turning                             .762     .761     57.9%

* Optimum order of test addition as determined
by Wherry Shrinkage Formula.
R: Multiple correlation coefficient.
R̄: Shrunken multiple correlation coefficient.
100(R̄)²: Coefficient of determination, expressed
as a percent.

Most striking is the amount of agreement
between the regular and the "shrunken"
multiple correlation coefficients. Differences
are reflected only in the third significant
figures. Moreover, the calculated
shrinkage decreases from 0.002 to 0.001
with the addition of the fourth and fifth
tests, respectively. There are several reasons
for this result.

1.  The  Wherry  shrinkage  formula  contains 
the fraction, (N - 1)/(N - M), where N is
the number of cases and M, the number of
variables in the regression equation. Thus,
little shrinkage is contributed by this factor
when the number of cases in the study is
large and the number of variables under consideration
is small. In the present study, this
fraction  differs  from  one  only  in  the  fourth 
significant  figure  as  the  second  and  third  tests 
are  added,  and  in  the  third  significant  figure 
with  the  addition  of  the  fourth  and  fifth 
tests. 

2.  The  magnitude  of  the  shrinkage  is  de- 
termined partly  by  the  amount  of  covariance 
added  by  a  given  test.  In  the  case  of  the 
fourth  and  fifth  tests,  the  amount  of  added 
covariance  is  small,  as  indicated  by  the  in- 
crements to  the  unshrunken  R. 

3.  As  with  the  zero-order  r's,  Wherry- 
Doolittle  calculations  were  carried  out  to  four 
significant  figures.  Apparently  the  cumulative 
effect of rounding the fourth significant figure
was sufficient to cause the difference between
the regular and the shrunken coefficients
to be smaller (by .001) with four
or five tests in the battery than with two or
three tests.
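
The role of the fraction (N - 1)/(N - M) in the first reason above can be illustrated with the shrinkage formula in its commonly cited closed form, R̄² = 1 - (1 - R²)(N - 1)/(N - M). This is a sketch under two assumptions: that this closed form is the one intended, and that M counts the tests in the battery, which is consistent with the remark that the fraction departs from one in the fourth significant figure for two and three tests. The function name and arguments are illustrative only.

```python
from math import sqrt


def wherry_shrunken_r(r_multiple, n_cases, n_tests):
    """Shrunken multiple R from the closed-form Wherry correction (assumed form)."""
    fraction = (n_cases - 1) / (n_cases - n_tests)
    shrunken_sq = 1 - (1 - r_multiple ** 2) * fraction
    # Guard against a negative value when R is very small.
    return sqrt(max(shrunken_sq, 0.0))


N = 233
for n_tests, r in [(1, 0.636), (2, 0.722), (3, 0.757)]:
    fraction = (N - 1) / (N - n_tests)
    print(n_tests, round(fraction, 4), round(wherry_shrunken_r(r, N, n_tests), 4))
```

Because the Wherry-Doolittle worksheet applies the correction step by step to intermediate figures carried to four places, its tabulated values need not agree with this closed form in the last digit, as the third reason above points out for the fourth and fifth tests.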

The  proper  testing  of  the  significance 
of  the  tabulated  R's  is  complicated  by  the 
fact  that  the  zero-order  r's  were  calculated 
biserially.  Although  the  biserial  r  is  math- 
ematically equivalent  to  a  product  mo- 
ment r  corrected  for  broad  categories, 
its  standard  error  is  somewhat  larger  than 
when computed by the product moment
method.  Consequently  one  would  expect 
that  a  given  multiple  R  would  be  signifi- 
cant at  a  higher  level  when  compounded 
of  product  moment  r's  than  is  the  case 
when  they  are  calculated  biserially.  How- 
ever, since  the  tabulated  R's  exceed  so 
considerably  the  usual  requirements  for 
significance  at  the  1%  level  (15),  a  test  of 
greater  refinement  is  rendered  unneces- 
sary  for  these  data. 

The  entries  in  the  fourth  column  of  the 
table  show,  for  each  R,  the  coefficient  of 
determination    expressed    as   a    percent. 
Since  equal  increments  to  this  coefficient 
have  the  same  significance  at  all  points 
along  the  scale,  it  provides  a  more  suit- 
able comparative  measure  of  the  contri- 
bution of  successive  tests  than  either  the 
regular  or  the  shrunken  multiple  corre- 
lation  coefficients.   Thus,   40.4%   of   the 
criterion  variance  is  assignable  to  vari- 
ance in  the  capacities  measured  by  the 
Purdue  assembly  test.  With  the  addition 
of  the  tweezer  dexterity  test,  the  amount 
of criterion variance accounted for becomes
51.8%, an increase of 11.4%. The
further  inclusion  of  the  placing  test  raises 
this  figure  to  57.0%,  an  increase  of  5.2%. 
Increments   of  0.8%   and   0.1%,   respec- 
tively, would  attend  the  addition  to  the 
battery  of  the  finger  dexterity  and  turn- 
ing tests. 
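
The percentages and increments just quoted follow directly from the shrunken coefficients of Table 9, as the short sketch below shows. The only assumption is that the fourth column was obtained by squaring the shrunken rather than the regular R, which is what the tabulated figures indicate.

```python
# Shrunken multiple correlations from Table 9, in order of test addition.
SHRUNKEN_R = [
    ("Purdue Assembly", 0.636),
    ("Tweezer Dexterity", 0.720),
    ("Placing", 0.755),
    ("Finger Dexterity", 0.760),
    ("Turning", 0.761),
]


def determination_percents(rows):
    """Coefficient of determination, as a percent, for each battery size."""
    return [round(100 * r ** 2, 1) for _, r in rows]


def increments(percents):
    """Gain in criterion variance accounted for by each added test."""
    return [round(b - a, 1) for a, b in zip(percents, percents[1:])]


percents = determination_percents(SHRUNKEN_R)
print(percents)
print(increments(percents))
```

Because the coefficient of determination is additive in this sense, the gains of successive tests can be compared directly, which is not true of differences between the R values themselves.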

It  appears  therefore  that,  while  the 
chance  errors  added  to  the  battery  by  the 
successive  tests  have  not  completely  out- 


weighed the  gains  in  forecasting  efficien- 
cy, it  would  almost  certainly  be  uneco- 
nomical of  time  and  money  to  include 
the  finger  dexterity  and  turning  tests  in 
the  final  battery. 

The  tlnee-tcst  equation.  Considering 
both  forecasting  efficiency  and  optimum 
utilization  of  testing  time,  a  multiple 
regression  equation,  based  upon  the  first 
three  tests  of  Table  9,  was  developed.  In 
standard  score  form,  this  equation  reads 

zc  =  .374zza  +  .316/.,,,  +  .267zp 
where  subscript,  c,  indicates  the  criterion 
and  subscripts  pa,  td  and  p  represent  the 
Purdue  assembly,  tweezer  dexterity  and 
placing  tests,  respectively.  Substituting 
the  means  and  sigmas  required  to  put  this 
equation  into  the  more  convenient  raw 
score  form,  yields  the  following:9 
Xc  =  4i4XB&  +  45iXtd+.392Xp-. 11-5.2 
The  standard  error  of  estimate,  in  terms 
of  the  parameters  of  the  assumed  cri- 
terion scale,  is  9.33. 
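As a rough check on how the raw score weights relate to the standard score weights, the sketch below applies the correction described in footnote 9 (criterion sigma divided by the multiple correlation). The test sigmas of 17.05 and 13.21 are read from the experimental-group column of Table 11; that these are the sigmas the author substituted is an assumption, as are the function names. No sigma for the placing test is given in this section, so that weight is not re-derived.

```python
# Convert standard-score (beta) weights to raw-score weights using the
# dispersion correction of footnote 9: the criterion sigma is divided by
# the multiple correlation coefficient before substitution.
CRITERION_SD = 14.286  # sigma of the assumed criterion scale
MULTIPLE_R = 0.757     # three-test multiple correlation (footnote 9)


def raw_weight(beta, test_sd):
    """Raw-score weight for one test, with the footnote 9 correction."""
    return beta * (CRITERION_SD / MULTIPLE_R) / test_sd


# Test sigmas taken from Table 11 (experimental group); an ASSUMPTION of
# this sketch, since the text does not list the sigmas that were used.
print(round(raw_weight(0.374, 17.05), 3))  # Purdue assembly
print(round(raw_weight(0.316, 13.21), 3))  # tweezer dexterity


def predict_three_test(x_pa, x_td, x_p):
    """Predicted criterion score from raw scores on the three tests."""
    return 0.414 * x_pa + 0.451 * x_td + 0.392 * x_p - 115.2
```

Under these assumptions the two recomputed weights round to the .414 and .451 of the raw score equation.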

The  two-test  equation.  A  second  equa- 
tion, based  upon  the  Purdue  assembly 
and  tweezer  dexterity  tests,  was  developed 
for  occasions  when  only  limited  time  is 
available  for  testing  applicants.  Utilizing 
the  same  symbols  and  subscripts  as  above, 
this  equation  quantitatively  relates  the 
standard  and  raw  scores  of  the  respective 
measures  as  follows: 

z_c = .469 z_{pa} + .380 z_{td}
and,

X_c = .545 X_{pa} + .569 X_{td} - 70.2
The standard error of estimate of the
equation in raw score form is 9.88. The
increase from 9.33 represents the loss in
precision resulting from the elimination
of the placing test. The loss is not substantial
compared with the magnitude of
the standard errors themselves. A similar
conclusion was suggested above, when
the addition of the placing test to the
battery increased the amount of criterion
variance accounted for, from 51.8% to
57.0%, an increment of 5.2%.

9 The standard deviation of the assumed criterion
scale (14.286) was divided by the multiple
correlation coefficient (.757) before substitution,
in order to correct the predictions of the raw
score form of the equation for reduced dispersion
(10). While this correction is not generally
found in standard texts, its effect on the spread
of the predicted criterion scores is so considerable
as to make it imperative when the standard deviation
of the predicted scores is used to evaluate the
performance of applicants. The raw score form
of the two-test equation is similarly corrected.

The  standard  error  of  estimate  is  a 
measure  of  the  effectiveness  of  the  bat- 
tery when  interest  centers  in  predicting 
the  performance  of  individual  applicants. 
For  example,  if  job  performance  were 
graded  on  the  assumed  criterion  scale,  the 
criterion  score  predicted  by  the  two-test 
equation  will  be  in  error  by  9.33  or  less 
in  roughly  two-thirds  of  the  cases.  By 
the  same  token,  a  greater  error  will  at- 
tend the  predictions  in  one-third  of  the 
cases.  When  one  considers  the  range  of 
scores  on  the  criterion  scale,  the  errors 
for  individual  prediction  are  consider- 
able. However,  as  Garrett  (4)  has  stated, 

It  may  be  argued  .  .  .  that  in  attempting 
to  predict  individual  performance  from  test 
scores  we  are  asking  too  much  of  our  battery 
of  tests— more  than  we  have  a  right  to  ex- 
pect. .  .  .  From  tables  of  life  expectancy  one 
can  tell  quite  accurately  how  many  men,  now 
aged 30, will survive to age 50. But prediction
of the life span of a given individual is
a  dubious  undertaking. 
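The "roughly two-thirds" interpretation of the standard error of estimate can be illustrated numerically. The sketch below assumes the textbook formula, sigma times the square root of one minus R squared, together with normally distributed errors; the text does not state which formula was used, so this is a consistency check only. The value of R is the .757 given in footnote 9 for the three-test battery.

```python
import math
from statistics import NormalDist

CRITERION_SD = 14.286  # sigma of the assumed criterion scale
MULTIPLE_R = 0.757     # three-test multiple correlation (footnote 9)


def standard_error_of_estimate(sd, r):
    """Textbook formula (an assumption here): sd * sqrt(1 - r**2)."""
    return sd * math.sqrt(1.0 - r * r)


def share_within(k):
    """Share of normally distributed errors within +/- k standard errors."""
    z = NormalDist()
    return z.cdf(k) - z.cdf(-k)


see = standard_error_of_estimate(CRITERION_SD, MULTIPLE_R)
print(round(see, 2))              # three-test standard error of estimate
print(round(share_within(1), 3))  # the "roughly two-thirds" of the text
```

On these assumptions the three-test figure of 9.33 is recovered, and about 68% of predictions fall within one standard error of the obtained score.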

Tiffin  (18),  working  from  the  Taylor- 
Russell  tables,  comes  to  a  similar  conclu- 
sion. Both  suggest  the  percent  of  proper 
placements  as  a  more  suitable  measure 
of the effectiveness of tests in the employment
situation. Accordingly, data from
the  present  experiment  will  be  analyzed 
from  this  point  of  view  in  a  later  section 
of  this  chapter. 

The  raw  score  forms  of  the  two  equa- 
tions define  the  optimum  use  of  the  test 
scores  in  predicting  the  performance  of 


applicants in the Training school. Any
weighting of the tests, other than those
indicated by the respective equations,
would result in a loss in forecasting efficiency
with respect to the criterion, for
the experimental group. In order to reduce
the computational labor involved in
applying  these  equations  to  a  mass  testing 
situation,  facilitating  tables  (10)  were 
developed. 

E.  CONVERTING  PREDICTIONS  ON  THE  ASSUMED  SCALE
TO  GRADES  ON  THE  RATING  SCALE

It was stated earlier that, instead of
endeavoring to predict directly to the
original platykurtic and negatively
skewed rating distribution, an assumed
criterion scale was constructed by selecting
a mean of 50.000 and a standard deviation
of 14.286 as parameters of the
criterion distribution. It now becomes
necessary to relate the predictions made
by the raw score forms of the equations
to the five grades of the original rating
scale. This was accomplished by dividing
the area under the normal curve defined
by the assumed parameters into five
parts, so that the percentage of the total
area in each of the five segments corresponded
successively to the percentages
in each of the five rating categories. For
example, 9.0% of the trainees were rated
Poor. The point on the assumed scale,
below which 9.0% of the people score, is
1.34σ below the mean. Since the standard
deviation was taken to be 14.286, this
corresponds to a distance of 19.1 below
the assumed mean of 50.0, or a score of
30.9. Thus, when test scores are substituted
in the regression equation and a
criterion score of 30.9 or less is obtained,
this is equivalent to a prediction of Poor.
A similar treatment applied to each step
of the original scale, yielded four criterion
scores which define the separation
between contiguous pairs of ratings.


Poor - Fair             30.9
Fair - Average          40.1
Average - Good          51.1
Good - Very Good        60.9

These results were used to locate the
ratings  at  the  top  of  the  chart  which  will 
be  described  in  the  next  section. 
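The conversion described in this section amounts to reading percentile points off a normal curve with mean 50.0 and sigma 14.286. A minimal sketch follows, using only the proportions stated in the text (9.0% rated Poor, 46.8% rated Good or Very Good, 22.3% rated Very Good). The proportion rated Fair is not stated in this section, so the Fair/Average boundary is left out; the names are illustrative.

```python
from statistics import NormalDist

# Assumed criterion scale from the text: mean 50.000, sigma 14.286.
scale = NormalDist(mu=50.0, sigma=14.286)


def cut_score(proportion_below):
    """Score on the assumed scale below which the given proportion falls."""
    return scale.inv_cdf(proportion_below)


# Proportions quoted in the text: 9.0% Poor, 46.8% Good or Very Good,
# 22.3% Very Good. Upper-tail shares are converted to proportions below.
cuts = {
    "Poor / Fair": cut_score(0.090),
    "Average / Good": cut_score(1.0 - 0.468),
    "Good / Very Good": cut_score(1.0 - 0.223),
}
for boundary, score in cuts.items():
    print(f"{boundary:18s} {score:.1f}")
```

The computed boundaries should agree with the tabled values to within about 0.1. The lowest one comes out fractionally under 30.9 because the text rounds the normal deviate to 1.34 before multiplying by sigma.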

F.   CHART  FOR  REPORTING  TEST 
PERFORMANCE 

In  order  to  present  test  results  to  the 
employment  office  in  a  way  which  would 
depict the relative standing of the applicant
in precise and readily comprehensible
form, a chart similar to Figure
3, was devised. When an applicant is
tested,  raw  scores  are  posted  on  the  short 
horizontal  lines  at  the  right.  The  relative 
standing  of  the  applicant  on  each  test  is 
then  denoted  by  a  red  pencil  mark  at 
the  appropriate  point  along  the  adjacent 
horizontal  line  in  the  body  of  the  chart. 


Weighted scores from the facilitating
tables are written to the right of the raw
scores and added. Since the weighted
scores in the tables were calculated in
such a way as to absorb a large part of the
constants (115.2 or 70.2), 20.0 is subtracted
mentally from the sum and the
result is posted on the short line to the
right of the scale for the overall score.
The corresponding point on the chart
is checked and represents the statistically
best prediction of the applicant's performance
which can be made on the basis
of the test results.

[Figure 3: chart with a scale for the percent scoring better than the applicant and rows for Overall Score, Purdue Assembly, Tweezer Dexterity and Placing; rating labels legible along the top include POOR, AVERAGE and VERY GOOD.]

Fig. 3. Chart for Reporting Test Performance and Predicted Criterion Score to the
Employment Office.

Thus the chart presents graphically
the standing of an applicant on each of
the tests as well as a prediction of her
future job performance. Each of these
items can be read on the scale of overall
scores, which putatively represents units
of equal ability; on the percentile scale,
which shows the percentage of applicants
scoring better than the one under
consideration; or, as a predicted job performance
rating. The scale values of all
measures  were  equated  by  locating  each 
of  the  ten  major  divisions  on  the  chart  as 
a  multiple  of  0.73  from  the  mean  of  each 
distribution.  This  fact  makes  it  possible 
to  regard  all  tests  as  if  they  were  scored 
on  the  same  numerical  scale  as  the  over- 
all score.  It  would  have  been  unnecessary, 
therefore,  to  have  recorded  the  raw  test 
scores  on  the  chart,  except  that  this  facili- 
tates marking  the  standing  of  an  appli- 
cant on  a  given  test.  Needless  to  say,  the 
normal  curve,  drawn  in  the  body  of  the 
chart,  applies  to  all  scales. 

G.   PREDICTING  FACTORY  PERFORMANCE 

It  was  implied  above  that  the  equation 
based  upon  the  Purdue  assembly,  tweezer 
dexterity  and  placing  tests  is  to  be  pre- 
ferred, when  time  permits,  to  the  one 
utilizing  the  scores  on  only  the  first  two 
tests.  This  was  the  conclusion  reached  by 
the  author  upon  completion  of  the  ex- 
periment with  performance  in  the  Vesti- 
bule Training  School  as  the  criterion. 

At  a  later  time,  however,  when  follow- 
up  and  termination  ratings  on  subse- 
quent performance  in  the  factory  were 
available,  it  became  apparent  that  the 
two-test  equation  was  superior  in  pre- 
dicting performance  with  respect  to  these 
factory  criteria.  This  discovery  was  of 
considerable  practical  importance  since 
there  is  little  point  in  adding  a  third 
test  to  the  battery  in  order  to  account  for 
an  additional  5.2%  of  criterion  variance 
in  the  training  school  if  this  procedure 
entails  a  loss  in  forecasting  efficiency 
when  the  equation  is  applied  to  the  fac- 
tory. The  analyses  which  were  made  to 
determine  the  reasons  for,  and  magnitude 
of  this  discrepancy,  involved  the  use  of 
several  criteria  of  factory  performance 
as  well  as  a  considerable  number  of 
subjects who received the tests before and
after the present study. However, in order
to  keep  this  report  within  manageable 
bounds  and  at  the  same  time  to  separate 
the  results  which  appear  to  have  worth- 
while, practical  applications  from  those 
of lesser value, the following more limited
evidence from the present experimental
sample is offered. In this way
conclusions  may  rest  upon  an  empirical 
foundation  without  ambiguous  refer- 
ences to  unpublished  studies. 

To  accomplish  this,  it  is  necessary  to 
introduce  a  new  criterion.  At  the  time  of 
the  present  experiment,  it  was  the  prac- 
tice to  send  follow-up  rating  forms  to  the 
immediate  supervisor  of  all  applicants 
and trainees who were placed, after testing,
on any one of the occupations then
under  investigation.  The  two  most  im- 
portant items  on  the  form  were  the  in- 
formation identifying  the  specific  opera- 
tion being  performed  by  the  individual 
and  a  summary  rating  on  "Overall  job 
performance,  day  in  and  day  out."  The 
scale  consisted  of  the  four  grades,  Poor, 
Fair,  Good  and  Excellent,  which  have 
been  in  use  for  a  number  of  years  on  the 
company's  termination  form.  The  form 
was  sent  to  the  factory  when  the  opera- 
tor had  completed  her  tenth  week  on  the 
job. 

Of  the  233  subjects  in  the  experimental 
group,  122  (52.4%)  were  sent  to  the  fac- 
tory as  mounters.10  However,  63  of  these 
left  the  company  before  or  around  the 
time  when  the  follow-up  ratings  were  sent 
out.  By  far  the  majority  of  these  termina- 
tions occurred  early  in  September  and 
were effected by the individuals themselves
for the purpose of resuming their
education. Moreover, of the 59 forms
which should have been returned, the net
yield of this sample for the criterion
under consideration was 35 ratings.

10 Some evidence for the effectiveness of placements
from the Vestibule Training School is provided
by the fact that no one in this group of
122 mounters was terminated as "Unsuited to
the Work."

A breakdown of the 35 ratings showed
that no one was rated Poor, in all probability
reflecting in part the reluctance of
supervisors to use this grade and in part
the screening which occurred in the
training school. Thirteen were rated
Fair, eighteen, Good and four, Excellent.
For the purpose of calculating biserial
correlations, the 35 subjects were divided
into two groups. The thirteen rated Fair
were classified as "unsatisfactory" and the
twenty-two rated Good and Excellent, as
"satisfactory." Because of the sample
size, the required means and standard deviations
were calculated from the ungrouped
data.

Two  sets  of  predicted  criterion  scores 
were  derived  by  substituting  in  each  of 
the  two  regression  equations,  the  raw 
scores  obtained  by  the  subjects  on  the 
prospective  tests  at  the  time  of  hiring. 
The  resultant  scores  were  regarded  as  pre- 
dictions, on  the  assumed  criterion  scale, 
of  the  level  of  efficiency  to  be  achieved  by 
each  subject  as  a  mounter  in  the  factory. 
The  question  now  is,  which  equation 
has  made  the  better  predictions? 

The  second  column  of  Table  10  pro- 
vides the  answer  to  this  question  as  it 
relates  to  this  particular  group  of  mount- 
ers. Converting  the  correlations  to  coeffi- 
cients of  determination  and  expressing 
the  result  as  a  percent,  the  equation  based 


on  only  two  tests  accounts  for  35.4%  of 
the  criterion  variance,  whereas,  the  equa- 
tion utilizing  the  scores  on  the  three 
tests  accounts  for  18.6%. 

While  these  statements  are  entirely  cor- 
rect when  applied  to  the  performance  of 
the  35  mounters  in  question,  they  do  not 
constitute  a  good  basis  for  comparing 
the  two  equations.  A  glance  at  columns 
three  and  four  of  Table  10  will  show  that 
the  factory  group  has  a  smaller  spread 
on  the  criterion  scale,  with  respect  to  the 
abilities  measured  by  the  tests  and 
weighted  by  the  respective  equations 
than  the  original  experimental  group. 
This  is  to  be  expected  in  view  of  the 
screening  which  occurred  in  the  Vesti- 
bule Training  School.  It  is  well  known, 
however,  that  the  spread  of  the  scores  has 
a  substantial  effect  upon  the  size  of  a 
sample  r.  And,  since  the  reduction  in 
spread  is  different  for  the  two  equations, 
the  two  coefficients  are  differentially  af- 
fected. The  correlations  must  therefore 
be  corrected  for  reduced  dispersion  be- 
fore their  relative  worth  can  be  deter- 
mined. The  corrected  coefficients  (14)  are 
entered  in  column  five  of  the  table. 
Squaring  the  two  corrected  coefficients 
and  multiplying  by  100,  the  two  equa- 
tions may  be  compared  more  properly  as 
follows: If the group of 35 mounters had
been distributed over the criterion scale
in the same way as the total applicant
group, the two-test equation would have
accounted for 46.0% of the criterion variance,
whereas, the three-test equation
would account for only 22.8%.


Table 10

Comparison of the Two Regression Equations with Respect to Efficiency
in Predicting the Performance of Mounters in the Factory*

(N = 35)

Equation           r_bis    σ_dist.    σ_dist.     r'      t_r'
                            N = 35     N = 233

2-test equation    .595     11.96      14.286      .678    3.16
3-test equation    .431     13.56      14.286      .477    2.22

* σ_dist. for N = 35 equals the computed standard deviation of the predicted criterion scores of the
35 subjects; σ_dist. for N = 233 is the standard deviation of the predicted criterion scores of the entire
experimental group.
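The percentages of criterion variance quoted for the factory group follow directly from squaring the coefficients of Table 10. The sketch below only repeats that arithmetic; the labels are illustrative, and nothing about the correction formula itself (reference 14) is assumed.

```python
# Biserial correlations between predicted criterion scores and the
# satisfactory/unsatisfactory factory rating, from Table 10 (N = 35).
coefficients = {
    "2-test, uncorrected": 0.595,
    "3-test, uncorrected": 0.431,
    "2-test, corrected": 0.678,
    "3-test, corrected": 0.477,
}


def percent_variance(r):
    """Coefficient of determination expressed as a percent."""
    return round(100.0 * r * r, 1)


for label, r in coefficients.items():
    print(f"{label:22s} {percent_variance(r):.1f}%")
```

Both before and after correction, the two-test equation accounts for roughly twice the factory criterion variance of the three-test equation.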

However, the interest of the reader
who is concerned with the practical decision
as to whether or not the data justifies
the use of the tests in the employment
office,  goes  beyond  the  correlations  found 
with  the  particular  subjects  of  the  sam- 
ple. It  is  well  known  that  rather  high 
correlations  sometimes  arise  by  chance, 
with  small  samples.  Before  much  confi- 
dence can  be  placed  in  the  results,  there- 
fore, it  is  necessary  to  determine  the 
probability  that  the  obtained  correlations 
could  have  arisen  by  chance  without  the 
existence  of  a  true  relationship  between 
the  tests  and  performance.  For  this  pur- 
pose, the  standard  error  of  a  biserial  r  of 
zero  was  calculated  for  these  data.  It  came 
out  to  be  .2145.  In  deciding  which  set  of 
correlations  should  be  tested,  the  uncor- 
rected or  the  corrected,  it  should  be  noted 
that  the  standard  error  of  a  biserial  r  does 
not  take  into  account  the  reduced  disper- 
sion of  the  experimental  group.  If  a  cor- 
relation coefficient  has  been  decreased  by 
reduced  dispersion,  it  may  legitimately 
be  corrected  to  a  coefficient  which  de- 
scribes the  same  amount  of  correlation 
for the population, in this case the applicant
group. Therefore, the t_r' entries in
the  last  column  of  the  table,  provide  the 
best  estimate  of  the  confidence  which  can 
be  placed  in  each  of  these  equations.  For 
33  degrees  of  freedom,  t  must  equal  2.035 
or  2.734  to  be  significant  at  the  5%  and 
1%  levels,  respectively.  Tested  in  this 
way,  the  two-test  equation  is  seen  to  be 
significant  at  far  better  than  the  1% 
level,  whereas,  the  three-test  equation  is 
significant somewhere between the 5%
and 1% levels. Since practical interest
centers  in  coefficients  indicating  a  posi- 
tive correlation,  it  may  be  said  that  there 
is  less  than  one  chance  in  200  that  the 


obtained correlation between the predictions
of the two-test equation and job
performance could have arisen by chance
without the existence of a true relationship.
There are less than 2.5 chances in
100 (i.e., 1 in 40) that a similar statement
is true for the three-test equation. These
statements are implicit in the definitions
of the 1% and 5% levels, respectively.

It is worthy of mention, that the uncorrected
r_bis for the two-test equation has
a t of 2.77 which, at 33 degrees of freedom,
is also significant at the 1% level.
The uncorrected coefficient for the three-test
equation barely misses significance
at the 5% level, with a t of 2.01. The fact
that  all  four  coefficients  are  probably 
attenuated  by  chance  errors  and  the  fact 
that  neither  the  tests  nor  the  performance 
ratings  are  perfectly  reliable,  lends  fur- 
ther support  to  the  testing  of  the  cor- 
rected correlations.  This  is  true  in  spite 
of  the  fact  that  a  correlation  corrected 
for  the  lack  of  perfect  reliability  of  both 
measures  would  hardly  be  of  practical 
value,  unless  one  planned  to  lengthen 
the  tests.  Correlations  corrected  for 
chance  errors  in  supervisory  judgment 
would  be  of  interest  and,  in  all  proba- 
bility, substantially  higher.  There  is  little 
question, however, that confidence in the
results is enhanced by the fact that both
the  corrected  and  the  uncorrected  coeffi- 
cients for  the  two-test  equation  are  sig- 
nificant at  the  1%  level. 
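The four t values discussed above are consistent with dividing each coefficient by the standard error of a biserial r of zero (.2145) and comparing the result with the critical values for 33 degrees of freedom. The following sketch reproduces that comparison; the function names are illustrative, and the critical values are those quoted in the text rather than recomputed.

```python
SE_ZERO_BISERIAL = 0.2145  # standard error of a biserial r of zero (from the text)
T_CRIT_05 = 2.035          # critical t, 5% level, 33 degrees of freedom
T_CRIT_01 = 2.734          # critical t, 1% level, 33 degrees of freedom


def t_ratio(r):
    """Ratio of a coefficient to the standard error of a zero biserial r."""
    return r / SE_ZERO_BISERIAL


def level(t):
    """Most stringent of the two quoted levels at which t is significant."""
    if t >= T_CRIT_01:
        return "1%"
    if t >= T_CRIT_05:
        return "5%"
    return "not significant"


for label, r in [("2-test corrected", 0.678), ("3-test corrected", 0.477),
                 ("2-test uncorrected", 0.595), ("3-test uncorrected", 0.431)]:
    t = t_ratio(r)
    print(f"{label:20s} t = {t:.2f}  ({level(t)})")
```

Only the uncorrected three-test coefficient falls below the 5% threshold, and it does so by a narrow margin.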

Assuming that no other data were
available and that the amount of confidence
which can be placed in the predictions
of factory performance should be
a determining factor, there is little question
that the two-test equation should be
selected.11 Therefore, only the results
from the equation which weights the
scores from the Purdue assembly and
tweezer dexterity tests will be presented
graphically in the next section.

11 Since the b- and beta-coefficients in the equations
turn upon relatively small differences in
zero-order r's, the data are not adequate to determine
whether or not predictions to the factory
criterion could be improved still further by
[altering] the weights assigned in the school
equation.

H.  GRAPHICAL  PRESENTATION  OF
THE  RESULTS

It has already been established that the
equation utilizing the scores on the Purdue
assembly and tweezer dexterity tests
makes predictions which are significantly
related to performance in both the training
school and the factory. Although the
graphical presentation of results will involve
restating these relationships in less
precise (though more palpable) form, several
additional and important practical
conclusions can be demonstrated more
simply in this way.

Tiffin (18) uses the term, "selection
ratio," to designate the percentage of the
applicant group which must be placed
on a particular job. If, from a group of
one hundred applicants, one selects the
forty who score highest on a test, he is
operating with a selection ratio of 40%;
whereas, if the highest seventy applicants
are placed on the job, the selection
ratio is 70%.

Figure 4 shows the effect of the selection
ratio on the percent of operators
who fall in the rating categories specified
on the curves. The chart is based upon
an array in which the predicted criterion
scores for the 233 subjects were tabulated
against the overall performance rating
received in the school. Both the selection
ratios and the percent satisfactory were
calculated cumulatively from the high
end of the test score distribution. The
percentages designated as selection ratio
have the same meaning as those at the
top of the chart for reporting test performance
(Figure 3). To facilitate comparisons
between the two sketches, selection
ratios increase from right to left in
both cases. The alternate numeration on
the abscissa represents the passing scores
corresponding to the various selection
ratios. These were derived from the assumed
population parameters of the
criterion distribution, and are directly
comparable to the scores on the overall
scale of Figure 3.

Curve (A) delineates the effect of variations
in the selection ratio upon the percent
of good and very good people in the
group selected for the job. The fact that
this curve intersects the vertical line corresponding
to a selection ratio of 100%
at an ordinate value of 46.8%, indicates
that when the entire group of 233 subjects
was hired, 109 (46.8%) were rated
Good and Very Good in the school. However,
a substantially greater percentage
of good and very good trainees could
have been sent to the school if the tests
had been used in selection, as demonstrated
by the rapid rise in the curve as
the selection ratio becomes smaller. It
was noted earlier, for example, that
52.4% of the trainees were assigned to
the factory as mounters. If this group
had been selected for the school by tests,
62.9% would have been rated Good and
Very Good, an increase of 16.1%. If only
40% of the total group had been selected
on the basis of the tests, the percent of
good and very good trainees would have
risen to 70.5%, an increment of 23.7%.

The attainable improvements in the
percent of very good people are even
more substantial. Thus 22.3% of the
entire group were rated Very Good. From
line (B) it is apparent that this figure
could have been increased to 34.7%,
if the 122 (52.4%) trainees who were
finally assigned to the factory as mounters
had been selected by the tests. If only
30% had been selected, 42% would have
been rated Very Good, an increase of
approximately 20%. The effect of smaller
selection ratios may be similarly read
from the graph.

Fig. 4. Effect of Selection Ratio on the Percent of Trainees in Indicated Rating Groups,
When Selected by the Final Battery. (N = 233)
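The cumulative tabulation behind a selection-ratio curve can be sketched as follows. The records below are hypothetical toy data, not the 233 cases of the study, and the function name and its rounding of the group size are assumptions made for illustration.

```python
def percent_in_category(records, selection_ratio, categories):
    """Percent of the selected group whose rating falls in `categories`.

    records: list of (predicted_score, rating) pairs.
    selection_ratio: fraction of applicants accepted, counted down from
    the high end of the predicted-score distribution.
    """
    ranked = sorted(records, key=lambda rec: rec[0], reverse=True)
    n_selected = max(1, round(selection_ratio * len(ranked)))
    chosen = ranked[:n_selected]
    hits = sum(1 for _, rating in chosen if rating in categories)
    return 100.0 * hits / n_selected


# HYPOTHETICAL toy data for illustration only.
toy = [(72, "Very Good"), (66, "Good"), (61, "Very Good"), (58, "Good"),
       (55, "Average"), (51, "Good"), (47, "Average"), (44, "Fair"),
       (39, "Average"), (31, "Poor")]
good = {"Good", "Very Good"}

for ratio in (1.0, 0.7, 0.4):
    print(ratio, round(percent_in_category(toy, ratio, good), 1))
```

Plotting the returned percentage against the selection ratio, for each rating group in turn, gives curves of the kind described for Figure 4.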

Obversely, the lines representing the
three lowest ratings show the extent to
which the percent of poorer trainees in
the selected group can be decreased by
reductions in the selection ratio. Thus,
while 9.0% of the total group were rated
Poor, line (E) demonstrates that this
could have been reduced to 2.0% by
operating with a selection ratio of 70%.
Furthermore, if the 52.4% who were assigned
from the school as mounters had
been selected on the basis of the tests,
only 0.8% would have been rated Poor,
a decrement of 8.1%. The possible reductions
in the percent of Poor and Fair,
and of Poor, Fair and Average operators
are even more striking.

The data for the 35 subjects in the factory
was treated in the same way. It will
be recalled that no subjects were rated
Poor, thirteen were rated Fair (unsatisfactory)
and twenty-two were rated Good
and Excellent (satisfactory). Thus 62.9%
of the group were satisfactory. This is
indicated in Figure 5 by the fact that
the line intersects the ordinate representing
a selection ratio of 100%, at 62.9%.
This figure is undoubtedly higher than
it would have been if the subjects had
not been screened through the Vestibule
Training School. We have seen the effect
of reduced dispersion on the correlation
coefficient. The relationship presented in
the graph is similarly affected. Despite
this limitation, the line shows that substantial
improvements could have been
achieved through the use of the tests. If
a selection ratio of 60% had been used,
the proportion of good and excellent
operators would have been increased
from 62.9% to 81.0%, a gain of 18.1%.
If a selection ratio of 40% had been operative,
85.7% would have been rated
Good or Excellent, an increment of
22.8%.

Fig. 5. Effect of Selection Ratio on the Value of the Final Battery in Selecting Good and
Excellent Mounters for the Factory. (N = 35)
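With only 35 cases, the factory percentages correspond to small whole-number counts. The sketch below shows one set of counts that reproduces the reported figures. The counts of 17 and 12 satisfactory operators are inferred from the percentages and are not stated in the text.

```python
N = 35             # mounters with follow-up ratings
SATISFACTORY = 22  # rated Good or Excellent


def pct(part, whole):
    """Percentage rounded to one decimal, as reported in the text."""
    return round(100.0 * part / whole, 1)


base = pct(SATISFACTORY, N)

# Group sizes implied by the two selection ratios discussed.
n60 = round(0.60 * N)
n40 = round(0.40 * N)

# Satisfactory counts INFERRED from the reported percentages (assumption).
sat60, sat40 = 17, 12

print(base, pct(sat60, n60), pct(sat40, n40))
print(round(pct(sat60, n60) - base, 1), round(pct(sat40, n40) - base, 1))
```

On this reading, a single operator rated differently would shift the 40% figure by about seven percentage points, so the gains are sensitive to individual ratings.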

It  is  natural  to  ask  at  this  point,  what 
percentage  improvement  is  necessary  to 
pay  the  cost  of  administering  the  tests? 
While  the  kind  of  information  required 
for  the  answer  is  easy  to  specify,  it  is  not 
so  easily  obtained.  However,  since  the 
tests  can  be  administered  on  a  group 
basis,  up  to  ten  or  twelve  applicants  can 
be  tested  conveniently  at  one  time.  If  one 
is  operating  with  a  selection  ratio  of  40%, 
ten  applicants  must  be  tested  for  every 
four  placed  on  mounting  jobs.  Since  the 
tests  can  be  administered  in  less  than 
twenty  minutes,  it  is  doubtful,  under  any 
circumstances,  whether  the  total  cost  of 
testing this group would be greater than
one hour's salary for the technician. The
question  now  becomes,  how  much  better 
must  these  four  people  be  to  pay  for  the 
testing  of  ten?  Although  this  cannot  be 
answered  precisely,  the  amount  must  be 
small  indeed.  The  total  hourly  savings 
through  superiority  in  quantity  or  qual- 
ity of  work  for  the  four  people  must  be 
multiplied by 40 to put it on a weekly
basis, and then by the number of weeks
the  operators  remain  on  the  job  and 
demonstrate  superiority.  Under  these  cir- 
cumstances, almost  any  discernible  im- 
provement should  cover  the  cost  of  test- 
ing. The  attainable  improvements  in  se- 
lection, demonstrated  in  this  study  are 
certainly  adequate  to  justify  their  use. 
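The break-even argument can be put in numbers. The sketch below expresses the testing cost in units of the technician's hourly salary; the ten-week tenure is a hypothetical figure chosen only for illustration, and the function names are not from the text.

```python
def break_even_group_gain(testing_cost, hours_per_week, weeks_on_job):
    """Total hourly saving, across all placed operators, that repays the cost."""
    return testing_cost / (hours_per_week * weeks_on_job)


def break_even_per_operator(testing_cost, n_placed, hours_per_week, weeks_on_job):
    """The same break-even figure expressed per placed operator."""
    return break_even_group_gain(testing_cost, hours_per_week, weeks_on_job) / n_placed


# Cost: one hour of technician salary (taken as 1.0 unit), four operators
# placed, a 40-hour week, and a HYPOTHETICAL tenure of ten weeks.
group = break_even_group_gain(1.0, 40, 10)
each = break_even_per_operator(1.0, 4, 40, 10)
print(group, each)
```

With these inputs each placed operator need only be worth about six ten-thousandths of a technician-hour more per hour worked, which bears out the remark that almost any discernible improvement covers the cost.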

I.  THE  EXPERIENCE   HYPOTHESIS 

The  correlations  between  the  test  and 
the  two  criteria  have  been  presented  as 
indicating  a  relationship  between  the 
aptitudes  measured  and  performance  on 
the  job.  As  an  alternative  to  accepting 
this  major  thesis,  it  may  be  argued  that 
the  empirical  correlations  were  the  re- 
sult of  differential  experience  operating 
as  a  spurious  factor.  It  was  noted  in 
Chapter  IV,  that  the  subjects  were  drawn 
from a highly industrialized area which
includes  numerous  establishments  manu- 
facturing miscellaneous  small  assemblies 
and  several  producing  vacuum  tubes. 
Trainees  having  previous  experience  as 
mounters  were  eliminated  with  reason- 
able certainty  through  the  experimental 
procedures.  Rehires  to  the  company  were 
also  excluded  regardless  of  position  held. 
However, these precautions do not necessarily
provide adequate assurance that all
experience which might spuriously inflate
empirical correlations has been controlled.

In  order  to  avoid  the  difficulties  and 
uncertainties  which  would  attend  evalu- 


ating each  subject's  report  of  her  own  ex- 
perience on  the  application  blank,  this 
hypothesis  will  be  rendered  improbable 
by a comparative study of the test scores
of applicants to a feeder plant of the
company, established in the Catskill
Mountains during the war. While this
rural area was not entirely without manufacturing
enterprises, it could hardly
be  described  as  industrialized.  There 
were,  moreover,  no  other  radio  tube 
plants  in  the  same  labor  market.  The 
applicants  may  therefore  be  regarded  as 
inexperienced  with  respect  to  skills  which 
may  be  helpful  in  test  and  job  situations. 

Since  the  more  important  results  of  the 
present  experiment  were  known  when 
the  rural  feeder  plant  was  opened,  the 
Purdue  assembly  and  tweezer  dexterity 
tests  were  administered  (along  with  a 
standard  intelligence  test)  as  part  of  the 
selection  and  placement  procedures.  Only 
those  applicants  who  satisfied  all  inter- 
view and  other  requirements  for  the  job 
of  mounter  took  the  tests.  Since  the  test 
scores  played  no  part  in  the  selection  of 
the  233  subjects  in  the  experimental 
group,  the  two  groups  are  comparable  in 
this  respect.  Test  scores  for  the  first  78 
applicants  processed  at  the  rural  plant 
were  treated  shortly  after  operations  were 
begun,  as  a  check  upon  the  reliability  of 
the  procedures  of  test  administration. 

The data in Table 11 show that the
scores of the rural group have a slightly
higher mean and a slightly greater dispersion
on the Purdue assembly and
tweezer dexterity tests than those of the
experimental group. It will be noted that
all differences, though small, are in a direction
contrary to that which would be
expected on the basis of the experience
hypothesis. However, since the critical
ratios range from 0.16 to 0.84, the differences
between the two groups could easily
have arisen by chance. None even approach
statistical significance.12

Table 11

Comparative Test Performance of the Experimental Group (N = 233)
and a Group of Applicants to a Rural Plant (N = 78)

Test               Statistic   Rural         Experimental   Diff.         σ diff.   Critical Ratio

Purdue Assembly    M           [illegible]   127.02         [illegible]   2.27      0.51
                   σ           17.48         17.05          0.43          1.10      0.27

Tweezer Dexterity  M           100.85        99.38          1.45          1.57      0.84
                   σ           [illegible]   13.21          0.20          0.86      0.16
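As a rough check on these figures, a critical ratio is a difference divided by the standard error of that difference. The sketch below assumes the usual large-sample formula for the standard error of a difference between two independent means, a normal curve for the chance probability, and an n / (n - 1) correction for the population variance estimates; the function names are illustrative and are not the author's. It uses the Purdue assembly standard deviations and sample sizes given in Table 11.

```python
import math

def se_diff_means(sd1, n1, sd2, n2):
    """Large-sample standard error of the difference between two independent means."""
    return math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)

def chance_probability(critical_ratio):
    """Two-sided probability of a ratio at least this large under the normal curve."""
    return math.erfc(abs(critical_ratio) / math.sqrt(2))

def variance_ratio(sd1, n1, sd2, n2):
    """F: larger over smaller population-variance estimate (assumed n / (n - 1) correction)."""
    v1 = sd1 ** 2 * n1 / (n1 - 1)
    v2 = sd2 ** 2 * n2 / (n2 - 1)
    return max(v1, v2) / min(v1, v2)

# Purdue assembly: rural SD 17.48 (N = 78), experimental SD 17.05 (N = 233)
se = se_diff_means(17.48, 78, 17.05, 233)
f_ratio = variance_ratio(17.48, 78, 17.05, 233)
print(round(se, 2), round(f_ratio, 2))

# Smallest and largest critical ratios mentioned in the text
for cr in (0.16, 0.84):
    print(cr, round(chance_probability(cr), 2))
```

Under these assumptions the standard error comes to about 2.27 and F to about 1.06, in agreement with the reported values, and critical ratios of 0.16 and 0.84 correspond to chance probabilities of roughly 87% and 40%.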

12 Large sample statistics were used in testing
the reliability of the differences between the two
groups. Since none of the differences approach
significance, the more refined tests were considered
unnecessary. Prior to publication, however,
these assumptions were checked using
population variances and taking into account
the unequal sizes of the samples (1). Application
of the t-test to the differences between the two
means yielded values of .51 and .83 for the Purdue
assembly and tweezer dexterity tests respectively,
as compared with .51 and .84 for the critical
ratios. Moreover, the ratio of the larger to
the smaller variances (F) became 1.06 for the
Purdue assembly and 1.04 for the tweezer dexterity
test. None of these values is significant at
the 5% level for the degrees of freedom involved.

These comparisons by themselves are
not sufficiently complete to be conclusive.
It is possible for two distributions of vastly
different shape to yield comparable
means and standard deviations. The chi-square
test was therefore applied to the
four distributions to detect any significant
deviations from normality. Theoretical
frequencies, calculated for each cell
from the respective means and standard
deviations, were added successively from
either end of the distributions. Each time
the total just reached or exceeded five, a
single cell was constituted. Because of the
smaller number of subjects in the rural
group, a greater number of step intervals
in the tails were combined. The calculated
values of chi-square and the corresponding
degrees of freedom are entered
in Table 12. Significance levels were derived
by interpolation between published
values (5) and show the probability that
chi-squares as great or greater than those
obtained could have arisen by chance. For
example, deviations from normality as
great as, or greater than, those of the
experimental group on the Purdue assembly
test would arise by chance 41%
of the time when the true shape of the
distribution is normal. Since none of the
figures even approaches the 5% level, we
may conclude that the distributions are
not significantly different from normal.
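The pooling rule and the significance levels described above can be sketched as follows. This is only one reading of the procedure, not the author's worksheet: the helper names, the choice to sweep inward toward the cell with the largest theoretical frequency, the treatment of leftovers, and the toy frequencies are all assumptions made for illustration. The chance probability is computed exactly for whole-number degrees of freedom rather than by interpolation in published tables, so it may differ from the tabled percentages by about a point.

```python
import math

def pool_cells(observed, expected, minimum=5.0):
    """Pool step intervals inward from each end; a cell is formed each time
    the running theoretical frequency reaches or exceeds `minimum`."""
    peak = max(range(len(expected)), key=lambda i: expected[i])

    def sweep(indices):
        cells, acc_o, acc_e = [], 0.0, 0.0
        for i in indices:
            acc_o += observed[i]
            acc_e += expected[i]
            if acc_e >= minimum:          # a single cell is constituted
                cells.append((acc_o, acc_e))
                acc_o, acc_e = 0.0, 0.0
        return cells, acc_o, acc_e        # leftovers that never reached the minimum

    low, lo_o, lo_e = sweep(range(peak))
    high, hi_o, hi_e = sweep(range(len(expected) - 1, peak, -1))
    # Assumption: leftover frequencies are absorbed by the central cell.
    centre = (observed[peak] + lo_o + hi_o, expected[peak] + lo_e + hi_e)
    return low + [centre] + high[::-1]

def chi_square(cells):
    """Sum of (observed - expected)^2 / expected over the pooled cells."""
    return sum((o - e) ** 2 / e for o, e in cells)

def chance_probability(chi2, df):
    """Probability of a chi-square at least this large (exact for integer df)."""
    y = chi2 / 2.0
    if df % 2 == 0:
        terms = [y ** i / math.factorial(i) for i in range(df // 2)]
        return math.exp(-y) * sum(terms)
    terms = [y ** (i - 0.5) / math.gamma(i + 0.5) for i in range(1, (df + 1) // 2)]
    return math.erfc(math.sqrt(y)) + math.exp(-y) * sum(terms)

# Hypothetical frequencies, for illustration only (not the study's data).
expected = [0.5, 1.5, 4.0, 9.0, 12.0, 8.0, 3.0, 1.5, 0.5]
observed = [1, 1, 5, 8, 13, 7, 3, 2, 0]
cells = pool_cells(observed, expected)
print(len(cells), round(chi_square(cells), 3))

# Chi-square values and degrees of freedom as entered in Table 12.
for chi2, df in [(16.74, 16), (9.20, 11), (11.55, 17), (13.89, 11)]:
    print(chi2, df, round(chance_probability(chi2, df), 2))
```

With the toy frequencies, nine step intervals collapse into five cells. For the Table 12 entries the computed probabilities come to roughly 40%, 60%, 83% and 24%, close to the interpolated levels of 41%, 60%, 82% and 24%.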
If differential experience had been present
to the extent that it materially affected
test performance, both the means
and standard deviations of the experimental
group would have been larger
than those of the rural group. The obtained
differences, however, while in the
opposite direction, were so slight as to be
insignificant. The fact that the chi-square
test showed no significant deviations from
normality provides additional evidence
against the experience hypothesis, since
if the aptitudes measured are normally
distributed, any factor differentially affecting
test performance would lead to
deviations from normality.

Table 12

The Chi-Square Test of Normality Applied to Score Distributions
of Rural and Experimental Subjects

Tests               Group          Chi-Square   Degrees of   Level of
                                                Freedom      Significance

Purdue Assembly     Experimental     16.74         16           41%
                    Rural             9.20         11           60%

Tweezer Dexterity   Experimental     11.55         17           82%
                    Rural            13.89         11           24%

Thus the hypothesis that the empirical
correlations arose spuriously from the effect
of differential experience on both
test and job performance appears highly
improbable.

The work experience of the 35 subjects
included in the factory follow-up, on the
other hand, was checked on an individual
basis. Of this group, 22 had no previous
work experience. Nine of the remaining
13 had no industrial experience. This
group included five salesgirls, two domestic
workers, one waitress and one assistant
to a beautician. Only four had had any
industrial experience at the time of hiring.
Two of these had performed an unspecified
sewing operation on handbags
for a number of years. However, one,
with a predicted criterion score of 55.6
(on scale for overall scores, Figure 3),
was rated Good in the factory, and the
other, whose score was 38.1, was rated
Fair. The third described her work as
"sewing machine operator" without specifying
the product. She received a predicted
score of 56.2 and was rated as Excellent.
However, she had left the sewing
job after trying it for only two weeks. The
fourth had worked eight months packing
lipsticks. She received a rating of Good
in the factory and a predicted criterion
score of 69.7. The fact that one applicant
with more than five years' experience sewing
handbags scored below average on the
tests and received a fair rating in the
factory, suggests that success in a sewing
operation is not necessarily helpful to
mounters in either the test or the job
situation. Packing lipsticks, moreover, is
hardly comparable to mounting. Thus
none of the 35 operators had previous
industrial experience with tweezers or in
the assembly of small and delicate parts.

The fact that no direct experience in
mounting was found in this sample serves
as a check on the reliability of the experimental
procedures for removing the spurious
effect of this factor. The weight of
this evidence is enhanced when one considers
that the 35 subjects rated constitute
a sample from the 59 cases (p. 29f) where
experience, if any, was most likely to be
found. It will be recalled that 122 of the
original 233 trainees were assigned from
the school as mounters. Since skill in this
operation was at a premium during this
period, one may safely assume that the
remaining 111 subjects were without
helpful experience. Of the 122 assigned
to mounting, 63 left the company before
the ratings were made. The bulk of these
subjects were vacation workers from the
schools and colleges. Thus, any subject
who had returned to mounting, after
previous experience with it, would more
likely be found among the remaining 59
operators than in any of the other groups.
It is probable, therefore, that a detailed
check of the application blanks of the
233 subjects would have revealed no experienced
mounters and that the experimental
controls on this factor were adequate.


VI.  Summary  and  Conclusions 


The assembly of small radio tube parts
by the process of resistance welding
was seen to require better than average
ability in the manipulation of small and
delicate parts with fingers and tweezers.
A  detailed  description  of  the  operation 
served  to  indicate  the  high  degree  of 
control  which  must  be  exercised  over  the 
movements  of  the  hands  and  fingers  in 
positioning  the  parts  between  the  two 
parallel  electrode  contact  surfaces.  Al- 
though other  aptitudes  and  worker  char- 
acteristics appeared  to  be  involved,  the 
present  study  was  confined  to  the  evalua- 
tion of  five  standard  manipulation  tests, 
namely,  the  Minnesota  Rate  of  Manipu- 
lation Tests,  Turning  and  Placing,  the 
O'Connor  Finger  Dexterity  and  Tweezer 
Dexterity  tests,  and  the  assembly  subtest 
of  the  Purdue  Pegboard.  These  tests  were 
administered  on  a  time  limit  basis  to  233 
prospective  mounters  at  the  time  of  hir- 
ing. The  scores  achieved  were  later  cor- 
related with  a  criterion  of  performance 
in  the  Vestibule  Training  School,  con- 
sisting of  the  pooled  judgments  of  the 
school  supervisor  and  the  staff  of  instruc- 
tors. 

Several  types  of  data  were  analyzed  in 
order  to  evaluate  the  criterion  of  per- 
formance in  the  school.  The  intercorre- 
lations  among  three  sets  of  ranks  from 
a  previous  experiment,  showed  that  the 
raters  were  capable  of  making  reliable 
judgments  of  trainee  performance  on  the 
basis  of  recorded  information  such  as  pro- 
duction test  scores,  number  of  defective 
units  produced  and  other  notations  made 
in  the  course  of  training.  Overall  rank- 
ings, moreover,  were  found  to  correlate 
highly with separate judgments of quantity
and quality. The correlations between
criterion ratings and production
test scores were taken to indicate both the
reliability and validity of the criterion.
The  fact  that  production  test  scores  were 
reasonably  reliable  measures  of  the  apti- 
tudes involved  in  the  performance  of 
each  operation  was  interpreted  as  further 
evidence  of  the  validity  of  the  overall 
ratings  based  upon  them. 

Correlations  between  the  criterion  and 
the  respective  tests  ranged  from  .482  to 
.636,  far  exceeding  the  requirements  for 
significance  at  the  1%  level  for  231  de- 
grees of  freedom.  Application  of  the 
Wherry-Doolittle  Test  Selection  Method 
showed  that,  while  the  shrunken  multiple 
correlation  coefficient  was  a  maximum 
when  the  battery  included  all  five  tests, 
the  turning  and  finger  dexterity  tests 
contributed  little  to  forecasting  efficiency. 
Two  multiple  regression  equations  were 
then  derived.  The  three-test  equation, 
utilizing  scores  from  the  Purdue  assem- 
bly, tweezer  dexterity  and  placing  tests, 
was found to account for 57.0% of the
criterion  variance.  The  two-test  equation, 
which  weighted  scores  on  only  the  first 
two  of  these  tests,  accounted  for  51.8%  of 
the  criterion  variance,  and  was  recom- 
mended for  occasions  when  only  limited 
time  was  available  for  testing. 
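The arithmetic behind this paragraph can be sketched briefly. The t transformation of a correlation coefficient is standard. The shrinkage formula shown is the commonly used adjustment for the number of predictors and is included here purely as an assumption, since the summary does not state which form of the shrinkage correction was applied or whether the quoted percentages are already shrunken. Function names are illustrative.

```python
import math

def t_for_correlation(r, df):
    """t statistic for testing a correlation coefficient against zero."""
    return r * math.sqrt(df / (1.0 - r ** 2))

def shrunken_r_squared(r_squared, n, k):
    """Commonly used shrinkage adjustment for k predictors and n cases (assumed form)."""
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - k - 1)

# Lowest and highest test-criterion correlations, 231 degrees of freedom.
for r in (0.482, 0.636):
    print(r, round(t_for_correlation(r, 231), 1))

# Proportion of criterion variance accounted for by each equation.
three_test, two_test = 0.570, 0.518
print(round(three_test - two_test, 3))   # gain from adding the placing test
print(round(math.sqrt(three_test), 3), round(math.sqrt(two_test), 3))  # multiple R

# Illustration only: what the assumed adjustment would do to the three-test figure.
print(round(shrunken_r_squared(three_test, 233, 3), 3))
```

Both t values (about 8.4 and 12.5) lie far beyond the value of roughly 2.6 required at the 1% level for 231 degrees of freedom, and the placing test adds about 5.2 percentage points of criterion variance to the two-test equation.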

In  order  to  separate  the  results  which 
appear  to  have  worthwhile  practical  ap- 
plications from  those  of  limited  value, 
the predictions made by the two equations
were further validated against a criterion
of factory performance. Even
though  criterion  data  were  available  for 
only  35  members  of  the  original  group  of 
233  trainees,  the  data  were  adequate  to 
demonstrate  that  the  predictions  of  the 
two-test  equation  are  significantly  related 
to  factory  performance  at  better  than  the 
1%  level.  The  forecasts  of  the  three-test 
equation,  on  the  other  hand,  barely 
missed significance at the 5% level prior
to correction for reduced dispersion and
somewhat exceeded this minimum requirement
when corrected. On the basis
of these results, the two-test equation was
recommended for application in the employment
office. There appeared to be little
point in including the placing test in
the battery to account for an additional
5.2% of criterion variance in the training
school, if the procedure entails a loss in
forecasting efficiency with respect to factory
performance.

The predictions of the two-test equation
were then analyzed graphically to
show the improvements in the percent of
superior employees who would have been
hired, operating with various selection
ratios. It was demonstrated that, with
selection ratios of moderate size, substantial
increases were attainable in the percent
of superior employees placed in both
the training school and the factory.

The comparisons of the scores of the
experimental subjects with a group of
applicants to a rural feeder plant served
as a check on both the procedures for
eliminating experienced mounters and
the possible spurious effect of other industrial
training. The absence of any significant
difference between the two
groups and the fact that none of the score
distributions deviated significantly from
normal, were interpreted as rendering the
experience hypothesis highly improbable.
Some additional confidence in these conclusions
may be derived from the fact
that the empirical differences between the
means and standard deviations, though
slight, were in a direction opposite to that
which would be expected on the basis of
the experience hypothesis. An individual
check on the work experience of the 35
mounters who were rated in the factory
revealed that none had had previous industrial
experience with tweezers or in
the assembly of small and delicate parts.

A chart was developed for presenting
test results to the employment office in a
precise and readily comprehensible form.
In addition, it provides the interviewers
with the selection ratio implied by the
acceptance of any individual applicant.
During the period immediately following
the installation of tests, the chart served
as a training device, showing not only the
magnitude of individual differences, but
the nature of their distribution.

Viewing the results in perspective, it
would be naive to pretend that two dexterity
tests will solve all selection problems
for the occupation of radio tube
mounter. Valuable as they have been
shown to be, there remains a sizeable
amount of residual variance to be explained.
Available measures of intelligence
and temperament, such as those
employed in the study by Forlano and
Kirkpatrick (2), should be validated with
larger samples. The investigation of individual
differences in perceptual and
visual aptitudes may also prove fruitful.
The possibility of devising new and perhaps
better manipulation tests must not
be overlooked. Finally, the development
of techniques to increase the percent of
mounters who make full use of their aptitudes
in the daily performance of their
jobs and who derive a reasonable amount
of satisfaction therefrom is an even more
challenging field for future research.
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